These do the same thing as the standard isdigit(), isspace(), and
isprint() but with multibyte and encoding support. But all the
callers are only interested in analyzing single-byte ASCII characters.
So this extra layer is overkill and we can replace the uses with the
standard functions.
All the t_is*() functions in ts_locale.c are under scrutiny because
they don't use the common locale provider framework but instead use
the global libc locale settings. For the functions being touched by
this patch, we don't need all that anyway, as mentioned above, so the
simplest solution is to just remove them. The few remaining t_is*()
functions will need a different treatment in a separate patch.
pg_trgm has some compile-time options with macros such as
KEEPONLYALNUM. These are not documented, and the non-default variant
is not supported by any test cases. As part of this undertaking, I'm
removing the non-default variant, as it is in the way of cleanup. So
in this case, the not-KEEPONLYALNUM code path is gone.
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/653f3b84-fc87-45a7-9a0c-bfb4fcab3e7d%40eisentraut.org
Run pgindent, pgperltidy, and reformat-dat-files.
This set of diffs is a bit larger than typical. We've updated to
pg_bsd_indent 2.1.2, which properly indents variable declarations that
have multi-line initialization expressions (the continuation lines are
now indented one tab stop). We've also updated to perltidy version
20230309 and changed some of its settings, which reduces its desire to
add whitespace to lines to make assignments etc. line up. Going
forward, that should make for fewer random-seeming changes to existing
code.
Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
The t_iseq() macro does not need to be guarded by a character
length check (at least when the comparison value is an ASCII
character, as its documentation requires). Some portions of
contrib/ltree hadn't read that memo, so simplify them.
The last change in gettoken_query,
- else if (charlen == 1 && !t_iseq(state->buf, ' '))
+ else if (!t_iseq(state->buf, ' '))
looks like it's actually a bug fix: I doubt that the intention
was to silently ignore multibyte characters as if they were
whitespace. I'm not tempted to back-patch though, because this
will have the effect of tightening what is allowed in ltxtquery
strings.
Discussion: https://postgr.es/m/2548310.1664999615@sss.pgh.pa.us
Not much to say here --- does what it says on the tin. The "binary"
representation in each case is really just the same as the text format,
though we prefix a version-number byte in case anyone ever feels
motivated to change that. Thus, there's not any expectation of improved
speed or reduced space; the point here is just to allow clients to use
binary format for all columns of a query result or COPY data.
This makes use of the recently added ALTER TYPE support to add binary
I/O functions to an existing data type. As in commit a80818605,
we can piggy-back on there already being a new-for-v13 version of the
ltree extension, so we don't need a new update script file.
Nino Floris, reviewed by Alexander Korotkov and myself
Discussion: https://postgr.es/m/CANmj9Vxx50jOo1L7iSRxd142NyTz6Bdcgg7u9P3Z8o0=HGkYyQ@mail.gmail.com
This is numbered take 7, and addresses a set of issues around:
- Fixes for typos and incorrect reference names.
- Removal of unneeded comments.
- Removal of unreferenced functions and structures.
- Fixes regarding variable name consistency.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com
By project convention, these names should include "P" when dealing with a
pointer type; that is, if the result of a GETARG macro is of type FOO *,
it should be called PG_GETARG_FOO_P not just PG_GETARG_FOO. Some newer
types such as JSONB and ranges had not followed the convention, and a
number of contrib modules hadn't gotten that memo either. Rename the
offending macros to improve consistency.
In passing, fix a few places that thought PG_DETOAST_DATUM() returns
a Datum; it does not, it returns "struct varlena *". Applying
DatumGetPointer to that happens not to cause any bad effects today,
but it's formally wrong. Also, adjust an ltree macro that was designed
without any thought for what pgindent would do with it.
This is all cosmetic and shouldn't have any impact on generated code.
Mark Dilger, some further tweaks by me
Discussion: https://postgr.es/m/EA5676F4-766F-4F38-8348-ECC7DB427C6A@gmail.com
ltree/ltree_gist/ltxtquery's headers stores data at MAXALIGN alignment,
requiring some padding bytes. So far we left these uninitialized. Zero
those by using palloc0.
Author: Andres Freund
Reported-By: Andres Freund / valgrind / buildarm animal skink
Backpatch: 9.1-
The tsquery, ltxtquery and query_int data types have a common ancestor.
Having acquired check_stack_depth() calls independently, each was
missing at least one call. Back-patch to 9.0 (all supported versions).
Because of gcc -Wmissing-prototypes, all functions in dynamically
loadable modules must have a separate prototype declaration. This is
meant to detect global functions that are not declared in header files,
but in cases where the function is called via dfmgr, this is redundant.
Besides filling up space with boilerplate, this is a frequent source of
compiler warnings in extension modules.
We can fix that by creating the function prototype as part of the
PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That
makes the code of modules cleaner, because there is one less place where
the entry points have to be listed, and creates an additional check that
functions have the right prototype.
Remove now redundant prototypes from contrib and other modules.
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement. Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest. In passing, add check_stack_depth() to prevent stack overflow
in related functions.
Back-patch to 8.4 (all supported versions). The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.
Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.
Security: CVE-2014-0064
The Solaris Studio compiler warns about these instances, unlike more
mainstream compilers such as gcc. But manual inspection showed that
the code is clearly not reachable, and we hope no worthy compiler will
complain about removing this code.
The latter was already the dominant use, and it's preferable because
in C the convention is that intXX means XX bits. Therefore, allowing
mixed use of int2, int4, int8, int16, int32 is obviously confusing.
Remove the typedefs for int2 and int4 for now. They don't seem to be
widely used outside of the PostgreSQL source tree, and the few uses
can probably be cleaned up by the time this ships.
After parsing a parenthesized subexpression, we must pop all pending
ANDs and NOTs off the stack, just like the case for a simple operand.
Per bug #5793.
Also fix clones of this routine in contrib/intarray and contrib/ltree,
where input of types query_int and ltxtquery had the same problem.
Back-patch to all supported versions.
unnecessary #include lines in it. Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.
For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.
While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
ways. I'm not totally sure that I caught everything, but at least now they pass
their regression tests with VARSIZE/SET_VARSIZE defined to reverse byte order.
return true for exactly the characters treated as whitespace by their flex
scanners. Per report from Victor Snezhko and subsequent investigation.
Also fix a passel of unsafe usages of <ctype.h> functions, that is, ye olde
char-vs-unsigned-char issue. I won't miss <ctype.h> when we are finally
able to stop using it.
more compliant with the error message style guide. In particular,
errdetail should begin with a capital letter and end with a period,
whereas errmsg should not. I also fixed a few related issues in
passing, such as fixing the repeated misspelling of "lexeme" in
contrib/tsearch2 (per Tom's suggestion).
Christopher Kings-Lynne wrote:
> I'm still getting ltree failures on 64bit freebsd:
>
> sed 's,MODULE_PATHNAME,$libdir/ltree,g' ltree.sql.in >ltree.sql
> gcc -pipe -O -g -Wall -Wmissing-prototypes -Wmissing-declarations -fpic -DPI
> C -DLOWER_NODE -I. -I../../src/include -c -o ltree_io.o ltree_io.c -MMD
> ltree_io.c: In function `ltree_in':
> ltree_io.c:57: warning: int format, different type arg (arg 3)
> ltree_io.c:63: warning: int format, different type arg (arg 4)
> ltree_io.c:68: warning: int format, different type arg (arg 3)
Teodor Sigaev