Historically we tolerated the absence of various C runtime library
features for the benefit of the MinGW tool chain, because it used
ancient msvcrt.dll for a long period of time. It now uses ucrt by
default (like Windows 10+, Visual Studio 2015+), and that's the only
configuration we're testing.
In practice, we effectively required ucrt already in PostgreSQL 17, when
commit 8d9a9f03 required _create_locale etc, first available in
msvcr120.dll (Visual Studio 2013, the last of the pre-ucrt series of
runtimes), and for MinGW users that practically meant ucrt because it
was difficult or impossible to use msvcr120.dll. That may not even have
been the first such case, but old MinGW configurations had already
dropped off our testing radar so we weren't paying much attention.
This commit formalizes the requirement. It also removes a couple of
obsolete comments that discussed msvcrt.dll limitations, and some tests
that used !defined(_MSC_VER) to imply msvcrt.dll. There are many more
anachronisms, but it'll take some time to figure out how to remove them
all. APIs affected relate to locales, UTF-8, threads, large files and
more.
Thanks to Peter Eisentraut for the documentation change. It's not
really necessary to talk about ucrt explicitly in such a short section,
since it's the default for MinGW-w64 and MSYS2. It's enough to prune
references and broken links to much older tools.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/d9e7731c-ca1b-477c-9298-fa51e135574a%40eisentraut.org
<!-- doc/src/sgml/README.non-ASCII -->
Representation of non-ASCII characters
--------------------------------------
Find non-ASCII characters using:
grep --recursive --color='auto' -P '[\x80-\xFF]' .
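Where grep -P is not available, a rough Python counterpart can be used.
This is only a sketch, not an exact equivalent of the command above: it
works on already-decoded text rather than raw bytes, and the function
name find_non_ascii is a hypothetical helper, not part of the tree.

```python
def find_non_ascii(text):
    """Return (line_number, line) pairs for lines holding a non-ASCII character."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Any code point above 0x7F is outside the ASCII range.
        if any(ord(ch) > 0x7F for ch in line):
            hits.append((number, line))
    return hits
```

To scan a file, read it with an explicit encoding and pass the result,
e.g. find_non_ascii(open(path, encoding="utf-8").read()).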
Convert to HTML4 named entity (&) escapes
-----------------------------------------
We support several output formats:
* html (supports all Unicode characters)
* man (supports all Unicode characters)
* pdf (supports only Latin-1 characters)
* info
While some output formatting tools support all Unicode characters,
others only support Latin-1 characters. Specifically, the PDF rendering
engine can only display Latin-1 characters; non-Latin-1 Unicode
characters are displayed as "###".
Therefore, in the SGML files, we only use Latin-1 characters. We
typically encode these characters as HTML entities, e.g., &Aacute;lvaro.
It is also possible to safely represent Latin-1 characters in UTF8
encoding for all output formats.
Do not use UTF numeric character escapes (&#nnn;).
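The rules above can be illustrated with a small Python sketch. It assumes
that the HTML4 entity table shipped in Python's standard html.entities
module is an acceptable source of names. The function name
to_named_entities is hypothetical. Characters outside Latin-1, and
Latin-1 code points without a named entity, are rejected rather than
turned into numeric escapes.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII Latin-1 characters with HTML4 named entities."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII is passed through unchanged.
            out.append(ch)
        elif cp <= 0xFF and cp in codepoint2name:
            # Latin-1 character with a named entity, e.g. U+00C1 -> &Aacute;
            out.append("&" + codepoint2name[cp] + ";")
        else:
            # Non-Latin-1 (or unnamed) characters are refused, because
            # numeric escapes are not to be used.
            raise ValueError("no Latin-1 named entity for U+%04X" % cp)
    return "".join(out)
```

For example, to_named_entities("\u00c1lvaro") yields the entity form
shown earlier in this section, while a character such as U+0142 raises
ValueError.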
HTML entities
official: http://www.w3.org/TR/html4/sgml/entities.html
one page: http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
other lists: http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html
https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references