1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00
Commit Graph

1375 Commits

Author SHA1 Message Date
Nick Wellnhofer
6e33d136e1 error: Fix initGenericErrorDefaultFunc compatibility macro again
Now it really should work as before.
2025-05-28 14:57:37 +02:00
Nick Wellnhofer
30cf6d0980 parser: Add XML_INPUT_USE_SYS_CATALOG
Also clean up catalog resolution and add error handling using the
global error.

Don't try to look up the resolved URI a second time.

Add some comments. Fix documentation.
2025-05-26 16:51:59 +02:00
Nick Wellnhofer
4dc44c83ab parser: Rework entity boundary check for element content
Only use depth of input stack. This makes the input ID unused
internally.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
db65b2fc51 SAX1: Align handling of default attributes with SAX2
The SAX1 parser is legacy code, but it seems more maintainable to align
it with SAX2.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
2f3655c9c3 parser: Pop PEs that start markup declarations explicitly
We currently only handle "Validity constraint: Proper Declaration/PE
Nesting", but we must detect "Well-formedness constraint: PE Between
Declarations" separately:

> The replacement text of a parameter entity reference in a DeclSep must
> match the production extSubsetDecl.

PEs in DeclSeps are PEs that start with a full markup declaration (or
another PE). These are handled in xmParse{Internal|External}Subset. We
set a flag on these PEs and don't close them implicitly in
xmlSkipBlankCharsPE. This will make unterminated declarations in such
PEs cause a parser error. The PEs are closed explicitly in
xmParse{Internal|External}Subset, the only location where they are
allowed to end.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
dd1961e0d8 valid: Skip more validity checks if not validating 2025-05-25 14:26:30 +02:00
Nick Wellnhofer
fca0860d6c tree: Deprecate public struct members related to DTDs
Let's deprecate these members for now. If these are really used, they
can be undeprecated later.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
7c9b55356d doc: Document unused error domains 2025-05-19 20:07:54 +02:00
Nick Wellnhofer
7008740a96 parser: Consolidate scanning of XML Names
Use new productions by default.

Fixes #194.
Fixes #364.
See #707.
2025-05-19 19:58:33 +02:00
Nick Wellnhofer
210f5a3746 chvalid: Mark functions as deprecated 2025-05-16 23:27:51 +02:00
Nick Wellnhofer
954aae907d doc: Improve regexp documentation 2025-05-16 21:13:17 +02:00
Nick Wellnhofer
c5b45fbc07 doc: Misc fixes 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
c4926b19d3 codegen: Merge xmlunicode.c into xmlregexp.c
Include generated parts.

Generate xmlChRangeGroups instead of functions for Unicode blocks.
2025-05-16 19:04:20 +02:00
Nick Wellnhofer
4cb767e96e codegen: Only generate tables for character ranges
The rest can be easily maintained manually.
2025-05-16 19:04:20 +02:00
Nick Wellnhofer
6f4b452742 parser: Stop using ctxt->linenumbers
I think this was used to avoid setting the `line` member before it was
added (20+ years ago).
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a05fa9a905 codegen: Rerun codegen scripts 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a40f36e7f2 include: Stop using *Ptr typedefs in public headers 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
2d83a84ca6 doc: Misc improvements 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
f0983199e8 html: Map some encodings according to HTML5
Windows-1252 is a superset of ISO-8859-1 and should be used instead.
Same for ASCII.

Also map UCS-2 and UTF-16 to UTF-16LE.
2025-05-12 14:04:30 +02:00
Nick Wellnhofer
628006f457 encoding: Add windows-1252
Fixes #915.
2025-05-12 13:27:22 +02:00
Nick Wellnhofer
f602c0c186 html: Rework serialization of meta encoding attributes
Don't allocate memory.
2025-05-12 00:05:02 +02:00
Nick Wellnhofer
0674ccb7cb html: Stop omitting end tags when serializing
Align with HTML5.
2025-05-11 20:57:07 +02:00
Nick Wellnhofer
05b8fe0a06 html: Don't escape RAWTEXT and PLAINTEXT
Align with HTML5.
2025-05-11 20:57:07 +02:00
Nick Wellnhofer
777e2adf77 io: Consolidate escaping code
Use generated table approach of xmlSerializeText for xmlEscapeText.

Move most code to xmlIO.c.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
dad1163078 entities: Always replace invalid chars when escaping
The previous refactor painstakingly recreated the different behavior of
separate functions that were merged. It makes

Optimize IS_CHAR check for non-ASCII chars.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
971038e59f html: Call lower-level escaping functions
Removes the need to pass a document around.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
63535d3922 tree: Make xmlNodeListGetStringInternal work with escape flags 2025-05-11 20:29:25 +02:00
Nick Wellnhofer
442c1903af doc: Fix some damage from automated conversions
Add some newlines, fix returns.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
98a61c9dff doc: Fix briefs in tree docs 2025-05-11 20:29:25 +02:00
Nick Wellnhofer
46f05ea4d5 html: Rework meta charset handling
Don't use encoding from meta tags when serializing. Only use the value
in `doc->encoding`, matching the XML serializer. This is the actual
encoding used when parsing.

Stop modifying the input document by setting meta tags before
serializing. Meta tags are now injected during serialization.

Add full support for <meta charset=""> which is also used when adding
meta tags.

Align with HTML5 and implement the "algorithm for extracting a character
encoding from a meta element". Only modify the encoding substring in
Content-Type meta tags.

Only switch encoding once when parsing.

Fix htmlSaveFileFormat with a NULL encoding not to declare a misleading
UTF-8 charset.

Fixes #909.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
38ea8fa9de doc: Fix varargs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
9bbffec568 doc: Move brief to top, params to bottom of doc comments 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
ab13fbfd68 doc: Misc fixes to error docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
b1685459a3 doc: Misc fixes to xmlsave docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
298f70b3d7 doc: Misc fixes to HTML tree docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
80b6429fb3 doc: Misc fixes to encoding docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
81ac2e27fd doc: Misc fixes to valid docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
714decd6d6 doc: Misc fixes to entities docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
f38f3e7b25 doc: Misc fixes to IO documentation 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
e6cfd04994 doc: Misc fixes to tree docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
1bf44f09ba doc: Misc fixes to parser docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
b7274fb02f doc: Misc fixes to HTML parser docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
411f30ef2a doc: Don't document legacy HTML parser macros 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
4a01087585 doc: Move parser option docs to enum 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
a449c5fde3 catalog: Deprecate some functions 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
075283d49d xlink: Deprecate remaining public function
This was never finished.
2025-05-06 19:51:38 +02:00
Nick Wellnhofer
2c150e62f5 doc: Formatting fixes 2025-05-02 20:21:39 +02:00
Nick Wellnhofer
08a282f9f7 doc: Doxygen fixes for xmlversion.h 2025-05-02 20:12:52 +02:00
Nick Wellnhofer
e78e05c990 doc: Fix autolinks to functions
Unfortunately, autolinks in .c files aren't converted by Doxygen for
some reason.
2025-05-02 17:45:31 +02:00
Nick Wellnhofer
f7c412874b doc: Remove more comment block headers 2025-05-02 17:41:26 +02:00