The "name" comparison operators now all support collations, making them
functionally equivalent to "text" comparisons, except for the different
physical representation of the datatype. They do, in fact, mostly share
the varstr_cmp and varstr_sortsupport infrastructure, which has been
slightly enlarged to handle the case.
To avoid changes in the default behavior of the datatype, set name's
typcollation to C_COLLATION_OID not DEFAULT_COLLATION_OID, so that
by default comparisons to a name value will continue to use strcmp
semantics. (This would have been the case for system catalog columns
anyway, because of commit 6b0faf723, but doing this makes it true for
user-created name columns as well. In particular, this avoids
locale-dependent changes in our regression test results.)
In consequence, tweak a couple of places that made assumptions about
collatable base types always having typcollation DEFAULT_COLLATION_OID.
I have not, however, attempted to relax the restriction that user-
defined collatable types must have that. Hence, "name" doesn't
behave quite like a user-defined type; it acts more like a domain
with COLLATE "C". (Conceivably, if we ever get rid of the need for
catalog name columns to be fixed-length, "name" could actually become
such a domain over text. But that'd be a pretty massive undertaking,
and I'm not volunteering.)
Discussion: https://postgr.es/m/15938.1544377821@sss.pgh.pa.us
This allows the compiler / linker to mark affected pages as read-only.
There's a fair number of pre-requisite changes, to allow the const
properly be propagated. Most of consts were already required for
correctness anyway, just not represented on the type-level. Arguably
we could be more aggressive in using consts in related code, but..
This requires using a few of the types underlying typedefs that
removes pointers (e.g. const NameData *) as declaring the typedefed
type constant doesn't have the same meaning (it makes the variable
const, not what it points to).
Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.
Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code. The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there. BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs. So the
net result is that in about half the cases, such comments are placed
one tab stop left of before. This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.
Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
The new type has the scope of whole the database cluster so it doesn't
behave the same as the existing OID alias types which have database
scope,
concerning object dependency. To avoid confusion constants of the new
type are prohibited from appearing where dependencies are made involving
it.
Also, add a note to the docs about possible MVCC violation and
optimization issues, which are general over the all reg* types.
Kyotaro Horiguchi
strncpy() has a well-deserved reputation for being unsafe, so make an
effort to get rid of nearly all occurrences in HEAD.
A large fraction of the remaining uses were passing length less than or
equal to the known strlen() of the source, in which case no null-padding
can occur and the behavior is equivalent to memcpy(), though doubtless
slower and certainly harder to reason about. So just use memcpy() in
these cases.
In other cases, use either StrNCpy() or strlcpy() as appropriate (depending
on whether padding to the full length of the destination buffer seems
useful).
I left a few strncpy() calls alone in the src/timezone/ code, to keep it
in sync with upstream (the IANA tzcode distribution). There are also a
few such calls in ecpg that could possibly do with more analysis.
AFAICT, none of these changes are more than cosmetic, except for the four
occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength
source leads to a non-null-terminated destination buffer and ensuing
misbehavior. These don't seem like security issues, first because no stack
clobber is possible and second because if your values of sslcert etc are
coming from untrusted sources then you've got problems way worse than this.
Still, it's undesirable to have unpredictable behavior for overlength
inputs, so back-patch those four changes to all active branches.
Previously, casts to name could generate invalidly-encoded results.
Also, make these functions match namein() more exactly, by consistently
using palloc0() instead of ad-hoc zeroing code.
Back-patch to all supported branches.
Karl Schnaitter and Tom Lane
to suppress zero-padding of "name" entries in indexes.
The alignment change is unlikely to save any space, but it is really needed
anyway to make the world safe for our widespread practice of passing plain
old C strings to functions that are declared as taking Name. In the previous
coding, the C compiler was entitled to assume that a Name pointer was
word-aligned; but we were failing to guarantee that. I think the reason
we'd not seen failures is that usually the only thing that gets done with
such a pointer is strcmp(), which is hard to optimize in a way that exploits
word-alignment. Still, some enterprising compiler guy will probably think
of a way eventually, or we might change our code in a way that exposes
more-obvious optimization opportunities.
The padding change is accomplished in one-liner fashion by declaring the
"name" index opclasses to use storage type "cstring" in pg_opclass.h.
Normally btree and hash don't allow a nondefault storage type, because they
don't have any provisions for converting the input datum to another type.
However, because name and cstring are effectively the same thing except for
padding, no conversion is needed --- we only need index_form_tuple() to treat
the datum as being cstring not name, and this is sufficient. This seems to
make for about a one-third reduction in the typical sizes of system catalog
indexes that involve "name" columns, of which we have many.
These two changes are only weakly related, but the alignment change makes
me feel safer that the padding change won't introduce problems, so I'm
committing them together.
the associated datatype as their equality member. This means that these
opclasses can now support plain equality comparisons along with LIKE tests,
thus avoiding the need for an extra index in some applications. This
optimization was not possible when the pattern opclasses were first introduced,
because we didn't insist that text equality meant bitwise equality; but we
do now, so there is no semantic difference between regular and pattern
equality operators.
I removed the name_pattern_ops opclass altogether, since it's really useless:
name's regular comparisons are just strcmp() and are unlikely to become
something different. Instead teach indxpath.c that btree name_ops can be
used for LIKE whether or not the locale is C. This might lead to a useful
speedup in LIKE queries on the system catalogs in non-C locales.
The ~=~ and ~<>~ operators are gone altogether. (It would have been nice to
keep them for backward compatibility's sake, but since the pg_amop structure
doesn't allow multiple equality operators per opclass, there's no way.)
A not-immediately-obvious incompatibility is that the sort order within
bpchar_pattern_ops indexes changes --- it had been identical to plain
strcmp, but is now trailing-blank-insensitive. This will impact
in-place upgrades, if those ever happen.
Per discussions a couple months ago.
characters in all cases. Formerly we mostly just threw warnings for invalid
input, and failed to detect it at all if no encoding conversion was required.
The tighter check is needed to defend against SQL-injection attacks as per
CVE-2006-2313 (further details will be published after release). Embedded
zero (null) bytes will be rejected as well. The checks are applied during
input to the backend (receipt from client or COPY IN), so it no longer seems
necessary to check in textin() and related routines; any string arriving at
those functions will already have been validated. Conversion failure
reporting (for characters with no equivalent in the destination encoding)
has been cleaned up and made consistent while at it.
Also, fix a few longstanding errors in little-used encoding conversion
routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic,
mic_to_euc_tw were all broken to varying extents.
Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki
for identifying the security issues.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.
The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
rid of the assumption that sizeof(Oid)==sizeof(int). This is one small
step towards someday supporting 8-byte OIDs. For the moment, it doesn't
do much except get rid of a lot of unsightly casts.
array header, and to compute sizing and alignment of array elements
the same way normal tuple access operations do --- viz, using the
tupmacs.h macros att_addlength and att_align. This makes the world
safe for arrays of cstrings or intervals, and should make it much
easier to write array-type-polymorphic functions; as examples see
the cleanups of array_out and contrib/array_iterator. By Joe Conway
and Tom Lane.
Second cut attached. This one just adds a boolean option to the existing
function to indicate that implicit schemas are to be included (or not).
I remembered the docs as well this time :-)
Dave Page
> Changes to avoid collisions with WIN32 & MFC names...
> 1. Renamed:
> a. PROC => PGPROC
> b. GetUserName() => GetUserNameFromId()
> c. GetCurrentTime() => GetCurrentDateTime()
> d. IGNORE => IGNORE_DTF in include/utils/datetime.h & utils/adt/datetim
>
> 2. Added _P to some lex/yacc tokens:
> CONST, CHAR, DELETE, FLOAT, GROUP, IN, OUT
Jan
allows the example in the CREATE SCHEMA ref page to actually work now.
Also, clean up when the transaction that initially creates a temp-table
namespace is later aborted. Simplify internal representation of search
path by folding special cases into the main list.