With the 'noError' argument, you can try to convert a buffer without
knowing the character boundaries beforehand. The functions now need to
return the number of input bytes successfully converted.
This is is a backwards-incompatible change, if you have created a custom
encoding conversion with CREATE CONVERSION. This adds a check to
pg_upgrade for that, refusing the upgrade if there are any user-defined
encoding conversions. Custom conversions are very rare, there are no
commonly used extensions that I know of that uses that feature. No other
objects can depend on conversions, so if you do have one, you can fairly
easily drop it before upgrading, and recreate it after the upgrade with
an updated version.
Add regression tests for built-in encoding conversions. This doesn't cover
every conversion, but it covers all the internal functions in conv.c that
are used to implement the conversions.
Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01%40iki.fi
The code for conversions SQL_ASCII <-> MULE_INTERNAL and
SQL_ASCII <-> UTF8 was unreachable, because we long ago changed
the wrapper functions pg_do_encoding_conversion() et al so that
they have hard-wired behaviors for conversions involving SQL_ASCII.
(At least some of those fast paths date back to 2002, though it
looks like we may not have been totally consistent about this until
later.) Given the lack of complaints, nobody is dissatisfied with
this state of affairs. Hence, let's just remove the unreachable code.
Also, change CREATE CONVERSION so that it rejects attempts to
define such conversions. Since we consider that SQL_ASCII represents
lack of knowledge about the encoding in use, such a conversion would
be semantically dubious even if it were reachable.
Adjust a couple of regression test cases that had randomly decided
to rely on these conversion functions rather than any other ones.
Discussion: https://postgr.es/m/41163.1559156593@sss.pgh.pa.us
Since some preparation work had already been done, the only source
changes left were changing empty-element tags like <xref linkend="foo">
to <xref linkend="foo"/>, and changing the DOCTYPE.
The source files are still named *.sgml, but they are actually XML files
now. Renaming could be considered later.
In the build system, the intermediate step to convert from SGML to XML
is removed. Everything is build straight from the source files again.
The OpenSP (or the old SP) package is no longer needed.
The documentation toolchain instructions are updated and are much
simpler now.
Peter Eisentraut, Alexander Lakhin, Jürgen Purtz
IDs in SGML are case insensitive, and we have accumulated a mix of upper
and lower case IDs, including different variants of the same ID. In
XML, these will be case sensitive, so we need to fix up those
differences. Going to all lower case seems most straightforward, and
the current build process already makes all anchors and lower case
anyway during the SGML->XML conversion, so this doesn't create any
difference in the output right now. A future XML-only build process
would, however, maintain any mixed case ID spellings in the output, so
that is another reason to clean this up beforehand.
Author: Alexander Lakhin <exclusion@gmail.com>
For DocBook XML compatibility, don't use SGML empty tags (</>) anymore,
replace by the full tag name. Add a warning option to catch future
occurrences.
Alexander Lakhin, Jürgen Purtz
DocBook XML is superficially compatible with DocBook SGML but has a
slightly stricter DTD that we have been violating in a few cases.
Although XSLT doesn't care whether the document is valid, the style
sheets don't necessarily process invalid documents correctly, so we need
to work toward fixing this.
This first commit moves the indexterms in refentry elements to an
allowed position. It has no impact on the output.
There is what may actually be a mistake in our markup. The problem is
in a situation like
<para>
<command>FOO</command> is ...
there is strictly speaking a line break before "FOO". In the HTML
output, this does not appear to be a problem, but in the man page
output, this shows up, so you get double blank lines at odd places.
So far, we have attempted to work around this with an XSL hack, but
that causes other problems, such as creating run-ins in places like
<acronym>SQL</acronym> <command>COPY</command>
So fix the problem properly by removing the extra whitespace. I only
fixed the problems that affect the man page output, not all the
places.
The endterm attribute is mainly useful when the toolchain does not support
automatic link target text generation for a particular situation. In the
past, this was required by the man page tools for all reference page links,
but that is no longer the case, and it now actually gets in the way of
proper automatic link text generation. The only remaining use cases are
currently xrefs to refsects.
another section if required by the platform (instead of the old way of
building them in section "l" and always transforming them to the
platform-specific section).
This speeds up the installation on common platforms, and it avoids some
funny business with the man page tools and build process.
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
output area as INTERNAL not CSTRING. This is to prevent people from
calling the functions by hand. This is a permanent solution for the
back branches but I hope it is just a stopgap for HEAD.
identifier, while some areas do not.
The attached converts be below to "name":
conversion_name
index_name
The below have an existing, initdb supplied, entity named "name". As
such, it could be confusing for the reader to see that identifier used
in the example.
domainname
typename
Rod Taylor
including:
- replacing all the appropriate usages of <citetitle>PostgreSQL
...</citetitle> with &cite-user;, &cite-admin;, and so on
- fix an omission in the EXECUTE documentation
- add some more text to the EXPLAIN documentation
- improve the PL/PgSQL RETURN NEXT documentation (more work to do here)
- minor markup fixes
Neil Conway
with OPAQUE, as per recent pghackers discussion. I still want to do some
more work on the 'cstring' pseudo-type, but I'm going to commit the bulk
of the changes now before the tree starts shifting under me ...