mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-23 01:52:48 +03:00
376 Commits

Author SHA1 Message Date
Nick Wellnhofer
7bd8d1d9cc doc: Prefix autolinks with '#'
Use `#func` instead of `func()` to ignore parameters and make all
autolinks work.
2025-05-28 16:01:52 +02:00
Nick Wellnhofer
30cf6d0980 parser: Add XML_INPUT_USE_SYS_CATALOG
Also clean up catalog resolution and add error handling using the
global error.

Don't try to look up the resolved URI a second time.

Add some comments. Fix documentation.
2025-05-26 16:51:59 +02:00
Nick Wellnhofer
34bafa14fe parser: Use parser context as default in resource loader
This allows access to the original context, for example when using
modules like XInclude or schemas.
2025-05-25 22:47:34 +02:00
Nick Wellnhofer
6f4b452742 parser: Stop using ctxt->linenumbers
I think this was used to avoid setting the `line` member before it was
added (20+ years ago).
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
adfbeb7e08 doc: Stop using *Ptr typedefs in documentation 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a40f36e7f2 include: Stop using *Ptr typedefs in public headers 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
2d83a84ca6 doc: Misc improvements 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
cdce17c3cb html: Only map HTML encodings from meta tag 2025-05-12 21:21:25 +02:00
Nick Wellnhofer
f0983199e8 html: Map some encodings according to HTML5
Windows-1252 is a superset of ISO-8859-1 and should be used instead.
Same for ASCII.

Also map UCS-2 and UTF-16 to UTF-16LE.
2025-05-12 14:04:30 +02:00
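
The encoding remapping described in f0983199e8 can be illustrated with a small lookup table. This is only a sketch of the idea and not the code from the commit: the function name `map_html_encoding`, the helper `eq_nocase` and the exact label spellings are assumptions made for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: case-insensitive comparison of two ASCII labels. */
static int eq_nocase(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same time. */
    return *a == *b;
}

/*
 * Hypothetical function: map an encoding label to the encoding that the
 * commit message says should be used for HTML. Labels without a mapping
 * are returned unchanged.
 */
const char *map_html_encoding(const char *name) {
    static const struct {
        const char *from;
        const char *to;
    } map[] = {
        { "ISO-8859-1", "windows-1252" }, /* windows-1252 is a superset */
        { "ASCII",      "windows-1252" }, /* same for ASCII */
        { "UCS-2",      "UTF-16LE" },
        { "UTF-16",     "UTF-16LE" },
    };
    size_t i;

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (eq_nocase(name, map[i].from))
            return map[i].to;
    }
    return name;
}
```

According to cdce17c3cb above, this kind of mapping is applied only to encodings taken from a meta tag, so a caller would invoke it at that point rather than for every encoding lookup.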
Nick Wellnhofer
442c1903af doc: Fix some damage from automated conversions
Add some newlines, fix returns.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
38ea8fa9de doc: Fix varargs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
9bbffec568 doc: Move brief to top, params to bottom of doc comments 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
ab13fbfd68 doc: Misc fixes to error docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
1bf44f09ba doc: Misc fixes to parser docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
cb1635a642 doc: Use @since command 2025-05-02 19:05:25 +02:00
Nick Wellnhofer
e78e05c990 doc: Fix autolinks to functions
Unfortunately, autolinks in .c files aren't converted by Doxygen for
some reason.
2025-05-02 17:45:31 +02:00
Nick Wellnhofer
1eca6e3476 parser: Deprecate xmlClearParserCtxt 2025-05-02 13:33:35 +02:00
Nick Wellnhofer
e525564f65 doc: Remove empty lines at start of block
These lines were left over after automatic conversion.
2025-05-02 11:42:05 +02:00
Nick Wellnhofer
e549622bc5 doc: Convert documentation to Doxygen
Automated conversion based on a few regexes.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
69879da88f doc: Remove email addresses from documentation
Also remove authorship information from generated files, hash.c and
globals.c which were rewritten.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
61890e399d doc: Prepare for conversion to Doxygen
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).

Fix formatting in a few corner cases that automatic conversion can't
handle.

Rearrange some DOC_DISABLE blocks.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
fc8899d47c parser: Make xmlCtxtGetValidCtxt depend on VALID_ENABLED 2025-04-27 13:01:42 +02:00
Nick Wellnhofer
b85d77d156 http: Remove built-in HTTP client
Stubs are retained for ABI compatibility.

Fixes #631.
Obsoletes #160.
2025-04-20 18:21:06 +02:00
Nick Wellnhofer
9e3159d04c parser: Never use XML catalogs when parsing HTML files
When loading HTML files we shouldn't try to resolve URIs using the XML
catalogs.
2025-04-19 14:52:14 +02:00
Nick Wellnhofer
b349225952 include: Change some return types from int to enum
This also affects some new functions from 2.13.
2025-03-14 02:31:01 +01:00
Nick Wellnhofer
fd1b939168 include: Convert some macros to enums 2025-03-14 00:35:40 +01:00
Nick Wellnhofer
84c6524e26 encoding: Support input-only and output-only converters
Make it possible to open an encoding handler only for input or output.
This avoids the creation of unnecessary converters.

Should also fix #863.
2025-03-13 22:15:10 +01:00
Nick Wellnhofer
69b83bb68e encoding: Detect truncated multi-byte sequences with ICU
Unlike iconv or the internal converters, ICU consumes truncated multi-
byte sequences at the end of an input buffer. We currently check for a
non-empty raw input buffer to detect truncated sequences, so this fails
with ICU.

It might be possible to inspect the pivot buffer pointers, but it seems
cleaner to implement a `flush` flag for some encoding and I/O functions.
After flushing, we can check for U_TRUNCATED_CHAR_FOUND with ICU, or
detect remaining input with other converters.

Also fix detection of truncated sequences for HTML, XML content and
DTDs with iconv.
2025-03-13 22:15:10 +01:00
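
To illustrate what "detect remaining input" means for a converter that leaves truncated sequences unconsumed (69b83bb68e), here is a minimal standalone sketch for UTF-8. It is not taken from libxml2: the function name `utf8_incomplete_tail` is hypothetical, and the sketch only looks at the tail of a buffer without validating the rest of the input.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the number of trailing bytes in buf that
 * form the start of a UTF-8 sequence which is cut off by the end of the
 * buffer, or 0 if the buffer does not end in a truncated sequence.
 */
size_t utf8_incomplete_tail(const unsigned char *buf, size_t len) {
    size_t back;

    /* Walk backwards over at most four bytes to find a lead byte. */
    for (back = 1; back <= 4 && back <= len; back++) {
        unsigned char c = buf[len - back];
        size_t need;

        if ((c & 0xC0) == 0x80)
            continue;               /* continuation byte, keep looking */

        if (c < 0x80)
            need = 1;               /* ASCII */
        else if ((c & 0xE0) == 0xC0)
            need = 2;
        else if ((c & 0xF0) == 0xE0)
            need = 3;
        else if ((c & 0xF8) == 0xF0)
            need = 4;
        else
            return 0;               /* invalid lead byte, not a truncation */

        /* Truncated if the sequence needs more bytes than are present. */
        return need > back ? back : 0;
    }
    return 0;
}
```

After a final flush, a non-zero result would correspond to the "remaining input" case in the commit message. With ICU, the message says the equivalent signal is U_TRUNCATED_CHAR_FOUND.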
Nick Wellnhofer
25490528af parser: Fix spurious error in SAX mode
Short-lived regression from 5f0b1378.
2025-03-11 16:34:30 +01:00
Nick Wellnhofer
5f0b1378d7 parser: Add more parser context accessors
Fixes #763.
2025-03-08 22:36:06 +01:00
Nick Wellnhofer
6bb2ea8e70 html: Adjust xmlDetectEncoding for HTML
Don't check for UTF-32 or EBCDIC.

We now perform BOM sniffing and the first step of the HTML5 prescan
algorithm (detect UTF-16 XML declarations). The rest of the algorithm
still has to be implemented.
2025-02-02 11:15:44 +01:00
Nick Wellnhofer
0de90f518d parser: Define SIZE_MAX 2025-01-30 01:25:31 +01:00
Nick Wellnhofer
3eced32ea3 parser: Fix push parser with encoding and single chunk
When push-parsing with an encoding handler, we must convert the whole
buffer in the initial conversion. Otherwise, parsing a single chunk
larger than ~4KB would fail.

Regressed with commit 34c9108f.
2025-01-30 00:02:34 +01:00
Nick Wellnhofer
1082d813e8 parser: Prepare to make decompression opt-in
Add a new parser option XML_PARSE_UNZIP that enables decompression.
xmlReadFile, xmlCtxtReadFile and xmlCreateURLParserCtxt always set
this option currently, but downstream users should start to set the
option if they really need it.
2025-01-29 00:49:57 +01:00
Nick Wellnhofer
a78843be5e xmllint: Support compressed input from stdin
Another regression related to reading from stdin.

Making a "-" filename read from stdin was deeply baked into the core
IO code but is inherently insecure. I really want to reenable this
dangerous feature as sparingly as possible.

This now enables compressed input when using the "Fd" API functions
which wasn't supported before. But XML_PARSE_NO_UNZIP will be
inverted later.

Allow compressed stdin in xmlReadFile to support xmlstarlet and older
versions of xsltproc. So far, these are the only known command-line
tools that rely on "-" meaning stdin.
2025-01-28 23:20:37 +01:00
Nick Wellnhofer
2e3a91a766 doc: Fix documentation 2024-12-26 21:05:39 +01:00
Nick Wellnhofer
8231c03663 parser: Check reallocations for overflow 2024-12-21 19:37:37 +01:00
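
The commit above has no message body, so the following is only a generic sketch of what an overflow-checked reallocation can look like, not the code from 8231c03663. The function name `grow_array` and the initial capacity of 16 are assumptions chosen for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: double the capacity of an array of elements of
 * size elem_size. Returns 0 on success and -1 if the new size would
 * overflow size_t or if the allocation fails. On failure, *buf and *cap
 * are left unchanged.
 */
int grow_array(void **buf, size_t *cap, size_t elem_size) {
    size_t new_cap;
    void *tmp;

    if (elem_size == 0)
        return -1;

    if (*cap == 0) {
        new_cap = 16;               /* arbitrary initial capacity */
    } else {
        if (*cap > SIZE_MAX / 2)
            return -1;              /* doubling would overflow */
        new_cap = *cap * 2;
    }

    if (new_cap > SIZE_MAX / elem_size)
        return -1;                  /* byte count would overflow */

    tmp = realloc(*buf, new_cap * elem_size);
    if (tmp == NULL)
        return -1;

    *buf = tmp;
    *cap = new_cap;
    return 0;
}
```

Both checks divide instead of multiplying, so the test itself cannot overflow. This relies on SIZE_MAX being available, which is what 0de90f518d above ensures inside the parser.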
Nick Wellnhofer
0dd910e82b save: Fix handling of catastrophic errors
Don't overwrite catastrophic errors in xmlSaveErr.

Overwrite non-catastrophic errors in xmlOutputBufferClose.
2024-12-19 02:30:36 +01:00
Nick Wellnhofer
1e1b48918c parser: Also raise error if ctxt is NULL
Update global error variable even if context is missing because of an
invalid (NULL) argument.
2024-12-13 17:57:11 +01:00
Nick Wellnhofer
70cce2ece3 parser: Make XML_ERR_RESOURCE_LIMIT non-catastrophic 2024-11-26 14:20:25 +01:00
Nick Wellnhofer
57087e5fc7 parser: Don't overwrite catastrophic errors
Stop reporting errors after a catastrophic error.

Also make sure that ctxt->errNo matches ctxt->lastError.code.
2024-11-26 00:47:48 +01:00
Nick Wellnhofer
0f4f89005d parser: Rename inputPush to xmlCtxtPushInput 2024-11-19 00:25:23 +01:00
Nick Wellnhofer
e2ad249c23 parser: Deprecate more internal symbols
- xmlParseExternalSubset
- xmlPushInput
- xmlPopInput
- xmlCopyCharMultiByte
- xmlCreateEntityParserCtxt
- xmlStringComment
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
bd9eed4694 parser: Make unsupported encodings an error in declarations
This was changed in 45157261, but in encoding declarations, unsupported
encodings should raise a fatal error.

Fixes #794.
2024-09-02 19:29:39 +02:00
Nick Wellnhofer
1d009fe35d parser: Report at least one fatal error 2024-08-05 15:14:21 +02:00
Nick Wellnhofer
bfed6e6ae8 parser: Fix error handling after reaching limit
Mark document as non-wellformed and stop parser even if error limit was
reached.

Regressed in abd74186.
2024-08-05 14:58:37 +02:00
Nick Wellnhofer
6a3c0b0d93 parser: Increase XML_MAX_DICTIONARY_LIMIT
This limit is somewhat arbitrary and can be reached when fuzzing
documents up to 1 MB.

Increase limit to 100 MB and disable limit if XML_PARSE_HUGE is set.
2024-07-22 12:53:00 +02:00
Nick Wellnhofer
a6f54f055b io: Fine-tune initial IO buffer size 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
34c9108f15 encoding: Add sizeOut argument to xmlCharEncInput
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
92f30711de parser: Optimize buffer shrinking
Remove checks now that we can shrink memory buffers efficiently.

Shrink more aggressively.
2024-07-16 17:42:10 +02:00