mirror of
https://github.com/postgres/postgres.git
synced 2025-10-25 13:17:41 +03:00
Move the checks out of the Makefile into a perl script that can be called from both the Makefile and meson.build. The set of files checked is simplified, so it is just all the sgml and xsl files found in docs/src/sgml directory tree. Along the way make some adjustments to .cirrus.tasks.yml to support this better in CI. Also ensure that the checks are part of the Makefile's html target. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-Author: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CAN55FZ3BnM+0twT-ZWL8As9oBEte_b+SBU==cz6Hk8JUCM_5Wg@mail.gmail.com
<!-- doc/src/sgml/README.non-ASCII -->
Representation of non-ASCII characters
--------------------------------------
Find non-ASCII characters using:
grep --recursive --color='auto' -P '[\x80-\xFF]' .
Convert to HTML4 named entity (&) escapes
-----------------------------------------
We support several output formats:
* html (supports all Unicode characters)
* man (supports all Unicode characters)
* pdf (supports only Latin-1 characters)
* info
While some output formatting tools support all Unicode characters,
others only support Latin-1 characters. Specifically, the PDF rendering
engine can only display Latin-1 characters; non-Latin-1 Unicode
characters are displayed as "###".
Therefore, in the SGML files, we can only use Latin-1 characters. We
can use UTF8 representations of Latin-1 characters, or HTML entities of
Latin-1 characters, e.g., Álvaro.
Do not use UTF numeric character escapes (&#nnn;).
When building the PDF docs, problem characters will appear as warnings.
HTML entities
official: http://www.w3.org/TR/html4/sgml/entities.html
one page: http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
other lists: http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html
http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references