Add README.non-ASCII to explain non-ASCII doc behavior; some text moved from release.sgml. Change UTF8 SGML characters to use HTML entities. Remove unnecessary UTF8 spaces. Add SVG file check for check-nbsp target. Add dummy 'pdf' Makefile target.

Reported-by: Yugo Nagata
Discussion: https://postgr.es/m/20241011114122.c90f8a871462da36f2e2afeb@sraoss.co.jp
Backpatch-through: master

<!-- doc/src/sgml/README.non-ASCII -->

Representation of non-ASCII characters
--------------------------------------

Find non-ASCII characters using:

    grep --recursive --color='auto' -P '[\x80-\xFF]' .
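
Where a report with exact positions is more convenient than grep output,
the same search can be sketched in Python. This is only an illustration,
not part of the documentation build: the function names and the "*.sgml"
file pattern are assumptions (the grep command above scans every file
under the current directory).

```python
from pathlib import Path


def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    # Split on "\n" only; str.splitlines() would swallow some non-ASCII
    # line separators (such as U+0085) instead of reporting them.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                hits.append((line_no, col, ch))
    return hits


def scan_tree(root, pattern="*.sgml"):
    """Yield (path, line_number, column, character) for files under root.

    The "*.sgml" default is an assumption made for this sketch.
    """
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, col, ch in find_non_ascii(text):
            yield path, line_no, col, ch
```

Iterating over scan_tree(".") from doc/src/sgml would list each offending
file, line, and column; find_non_ascii() can also be applied to a single
string.
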

Convert to HTML4 named entity (&) escapes
-----------------------------------------

We support several output formats:

*  html (supports all Unicode characters)
*  man (supports all Unicode characters)
*  pdf (supports only Latin-1 characters)
*  info

While some output formatting tools support all Unicode characters,
others only support Latin-1 characters.  Specifically, the PDF rendering
engine can only display Latin-1 characters; non-Latin-1 Unicode
characters are displayed as "###".

Therefore, in the SGML files, we only use Latin-1 characters.  We
typically encode these characters as HTML entities, e.g., &Aacute;lvaro.
It is also possible to safely represent Latin-1 characters in UTF8
encoding for all output formats.
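
The entity convention above can be sketched with Python's standard
html.entities.codepoint2name table, which maps code points to HTML4
entity names. The function name is hypothetical, and the sketch does not
check whether a given entity is actually defined for the documentation
build; it only rewrites Latin-1 characters and rejects anything the PDF
output could not display.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII Latin-1 characters with HTML4 named entities.

    ASCII characters are copied unchanged.  Characters outside Latin-1,
    or Latin-1 code points without an HTML4 name (the C1 controls
    U+0080..U+009F), raise ValueError.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFF and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            raise ValueError(
                f"U+{cp:04X} cannot be written as a Latin-1 named entity"
            )
    return "".join(out)
```

For example, to_named_entities("\u00c1lvaro") returns "&Aacute;lvaro",
while a string containing U+0142 (not a Latin-1 character) raises
ValueError.
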

Do not use UTF numeric character escapes (&#nnn;).
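
A check for such escapes could look like the following Python sketch.
The function name is hypothetical, and treating the hexadecimal form
(&#xhhh;) as equally unwanted is an assumption; the text above names
only the decimal form.

```python
import re

# Decimal (&#193;) and, by assumption, hexadecimal (&#xC1;) escapes.
NUMERIC_ESCAPE = re.compile(r"&#(?:[0-9]+|[xX][0-9A-Fa-f]+);")


def find_numeric_escapes(text):
    """Return every numeric character escape found in text, in order."""
    return NUMERIC_ESCAPE.findall(text)
```

Named entities such as &Aacute; are not matched, so an empty result
means the text contains no numeric escapes.
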

HTML entities
    official:    http://www.w3.org/TR/html4/sgml/entities.html
    one page:    http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
    other lists: http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html
                 http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
                 https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references