Nick Wellnhofer
ca81916023
include: Use intptr_t to cast between pointers and ints
2025-01-03 20:59:10 +01:00
Nick Wellnhofer
2e3a91a766
doc: Fix documentation
2024-12-26 21:05:39 +01:00
Nick Wellnhofer
8231c03663
parser: Check reallocations for overflow
2024-12-21 19:37:37 +01:00
Nick Wellnhofer
6548ba11b8
parser: Fix argument checks in xmlCtxtParse*
- Raise invalid argument error.
- Free input stream if ctxt is NULL.
2024-12-13 17:57:11 +01:00
Nick Wellnhofer
eae9a1bd8b
parser: Pop input stream in xmlCtxtValidateDtd
2024-11-26 14:30:54 +01:00
Nick Wellnhofer
dafcefb228
parser: Fail on catastrophic errors in recovery mode
2024-11-26 00:47:48 +01:00
Nick Wellnhofer
0dc26910c1
parser: Deprecate more internal functions
2024-11-21 22:31:20 +01:00
Nick Wellnhofer
84a6eece62
parser: Remove unneeded call to xmlDetectEncoding
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
497081baab
parser: Remove remaining calls to xml{Push|Pop}Input
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
0f4f89005d
parser: Rename inputPush to xmlCtxtPushInput
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
e2ad249c23
parser: Deprecate more internal symbols
- xmlParseExternalSubset
- xmlPushInput
- xmlPopInput
- xmlCopyCharMultiByte
- xmlCreateEntityParserCtxt
- xmlStringComment
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
631778f679
parser: Check for malloc failure in xmlCtxtParseDtd
2024-11-17 12:11:41 +01:00
Nick Wellnhofer
7f8c436c75
parser: Implement xmlCtxtParseDtd and xmlCtxtValidateDtd
This allows using the context's error handler, options and other
settings.
Fixes #808.
2024-11-15 16:30:52 +01:00
Ruslan Garipov
aaecdc92e2
parser: Assign value without if-statement
This removes an if-statement that effectively does nothing. For
example, the binary generated by GCC with -O2 optimization does not
contain that if-statement; the code just uses the hprefix->name field
directly.
No functional changes intended.
Signed-off-by: Ruslan Garipov <ruslanngaripov@gmail.com>
2024-11-12 16:42:36 +05:00
Nick Wellnhofer
869e3fd421
parser: Fix loading of parameter entities in external DTDs
Regressed with commit 12f0bb94.
Fixes #816.
2024-11-01 16:53:18 +01:00
Nick Wellnhofer
efb57ddba3
parser: Fix downstream code that swaps DTDs
Downstream code like the nginx xslt module can change the document's DTD
pointers in a SAX callback. If an entity from a separate DTD is parsed
lazily, its content must not reference the current document.
Regressed with commit d025cfbb.
Fixes #815.
2024-10-30 14:13:38 +01:00
Nick Wellnhofer
0ec5687e06
parser: Rework xmlCtxtGrowAttrs
Remove unneeded argument.
Check for integer overflow. We probably hit the buffer size limit in
xmlParserGrow before, but better be safe.
2024-10-28 21:06:52 +01:00
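The integer-overflow check described in the xmlCtxtGrowAttrs commit above can be illustrated with a minimal sketch. This is not the libxml2 implementation: `grow_capacity`, `grow_array`, and the `initial` and `max` parameters are hypothetical names chosen for illustration, and the doubling strategy is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not libxml2 code: compute the next capacity for a
 * growable array. Returns 0 on success and -1 if the array may not grow.
 */
static int
grow_capacity(int cur, size_t elem_size, int initial, int max, int *out) {
    int next;

    assert(elem_size > 0 && initial > 0 && max >= initial);

    if (cur <= 0) {
        next = initial;
    } else if (cur >= max) {
        return -1;                  /* already at the limit */
    } else if (cur > max / 2) {
        next = max;                 /* doubling would pass the limit */
    } else {
        next = cur * 2;             /* cannot overflow: cur <= max / 2 */
    }

    /* The byte count must fit into size_t before it reaches realloc. */
    if ((size_t) next > SIZE_MAX / elem_size)
        return -1;

    *out = next;
    return 0;
}

/* Grow *array to the new capacity; the old block stays valid on failure. */
static int
grow_array(void **array, int *cap, size_t elem_size, int initial, int max) {
    int next;
    void *tmp;

    if (grow_capacity(*cap, elem_size, initial, max, &next) < 0)
        return -1;
    tmp = realloc(*array, (size_t) next * elem_size);
    if (tmp == NULL)
        return -1;
    *array = tmp;
    *cap = next;
    return 0;
}
```

Checking the capacity before multiplying keeps both the `int` arithmetic and the `size_t` byte count in range, so the check still holds even if an outer buffer size limit is never reached.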
Nick Wellnhofer
ffb058f484
parser: Fix detection of duplicate attributes
We really need a second scan if more than one namespace clash was
detected.
2024-10-28 20:26:55 +01:00
Nick Wellnhofer
b52a3044aa
parser: Use counted_by attribute if supported
We only have a single struct with a flexible array member.
2024-10-24 18:18:47 +02:00
Nick Wellnhofer
74dfc49b5f
parser: Clarify logic in xmlParseStartTag2
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
0bc4608c50
html: Use hash table to check for duplicate attributes
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
0ce7bfe559
html: Try to avoid passing XML options to HTML parser
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
16de1346eb
parser: Make new options actually work
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
dde62ae5d5
parser: Align push parsing of CDATA sections with pull parser
Remove special handling of CDATA sections in push parser. This makes
sure that only a single callback is generated for large sections.
Fixes #22 and needed for #412.
2024-08-29 01:28:49 +02:00
Nick Wellnhofer
4d10e53af1
parser: Make sure to set and increment input id
Revert part of commits 410931e3 and b9d2f3c9.
2024-08-28 22:47:20 +02:00
Nick Wellnhofer
6d365ca02c
doc: XML_PARSE_NO_XXE is available since 2.13.0
2024-08-28 22:09:30 +02:00
makise-homura
103aadbc66
parser: Suppress EDG maybe-uninitialized warning
2024-08-16 22:26:07 +03:00
Nick Wellnhofer
02fcb1effb
parser: Make xmlParseChunk return an error if parser was stopped
This regressed after enhancing the disableSAX member in 2.13.
Should fix #777.
2024-07-25 17:07:18 +02:00
Nick Wellnhofer
1a89323039
[CVE-2024-40896] Fix XXE protection in downstream code
Some users set an entity's children manually in the getEntity SAX
callback to restrict entity expansion. This stopped working after
renaming the "checked" member of xmlEntity, making at least one
downstream project and its dependants susceptible to XXE attacks.
See #761.
2024-07-24 17:19:32 +02:00
Nick Wellnhofer
6a3c0b0d93
parser: Increase XML_MAX_DICTIONARY_LIMIT
This limit is somewhat arbitrary and can be reached when fuzzing
documents up to 1 MB.
Increase limit to 100 MB and disable limit if XML_PARSE_HUGE is set.
2024-07-22 12:53:00 +02:00
Nick Wellnhofer
5d36664fc9
memory: Deprecate xmlGcMemSetup
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
7148b77820
parser: Optimize memory buffer I/O
Reenable zero-copy IO for zero-terminated static memory buffers.
Don't stream zero-terminated dynamic memory buffers on top of creating
a copy.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
34c9108f15
encoding: Add sizeOut argument to xmlCharEncInput
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
6be79014d7
Remove unused code
2024-07-15 16:33:38 +02:00
Nick Wellnhofer
fee0006a06
parser: Fix memory leak after malloc failure in xml*ParseDTD
2024-07-15 13:03:55 +02:00
Nick Wellnhofer
8af55c8d20
parser: Rename new input API functions
These weren't made public yet.
2024-07-11 01:33:29 +02:00
Nick Wellnhofer
d74ca59491
parser: Rename internal xmlNewInput functions
2024-07-11 01:31:50 +02:00
Nick Wellnhofer
4f329dc524
parser: Implement xmlCtxtParseContent
This implements xmlCtxtParseContent, a better alternative to
xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a
parser context and a parser input, making it a lot more versatile.
xmlParseInNodeContext is now implemented in terms of
xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never
modifies the target document, improving thread safety.
xmlParseInNodeContext is also more lenient now with regard to undeclared
entities.
Fixes #727.
2024-07-11 01:26:32 +02:00
Nick Wellnhofer
f51ad063a7
parser: Fix error return of xmlParseBalancedChunkMemory
Only return an error code if the chunk is not well-formed to match the
2.12 behavior. Return 0 on non-fatal errors like invalid namespaces.
Fixes #765.
2024-07-08 11:28:33 +02:00
Nick Wellnhofer
2e63656ec6
parser: Check return value of inputPush
inputPush typically doesn't fail because we pre-allocate the input
table. The return value should be checked nevertheless.
2024-07-08 11:27:52 +02:00
Nick Wellnhofer
1e5375c1b4
SAX2: Check return value of xmlPushInput
Fix null deref in case of malloc failure.
2024-07-06 15:33:06 +02:00
Nick Wellnhofer
38195cf596
parser: Don't produce names with invalid UTF-8 in recovery mode
2024-07-06 15:33:06 +02:00
Nick Wellnhofer
fdfeecfe5e
parser: Reenable ctxt->directory
Unused internally, but used in downstream code.
Should fix #753.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
606f410891
parser: Allow to disable catalogs with parser options
Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI.
Fixes #735.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
866be54e22
parser: Don't use deprecated xmlSplitQName
2024-07-02 13:34:11 +02:00
Nick Wellnhofer
bc793390d5
parser: Update documentation
2024-06-27 16:23:14 +02:00
Nick Wellnhofer
eca972e682
parser: Add getters for XML declaration to parser context
Access to struct members will be deprecated.
2024-06-27 14:44:49 +02:00
Mike Dalessio
bbbbbb4649
parser: implement xmlCtxtGetOptions
In 712a31ab, the `options` struct member was deprecated. To allow
callers to check the status of options bits, introduce
xmlCtxtGetOptions.
2024-06-20 20:39:54 +00:00
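As an illustration of the xmlCtxtGetOptions commit above, the following sketch shows the general pattern of reading option bits through a getter instead of a struct member. All names (`demo_ctxt`, `demo_ctxt_get_options`, `demo_ctxt_has_option`, `DEMO_OPT_*`) and the -1 return for a NULL context are assumptions of this sketch, not libxml2's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option bits; the values are illustrative only. */
enum demo_option {
    DEMO_OPT_RECOVER = 1 << 0,
    DEMO_OPT_NOENT   = 1 << 1,
    DEMO_OPT_HUGE    = 1 << 2
};

/* In a real library this definition would live in a private header. */
struct demo_ctxt {
    int options;
};

/* Getter: returns the option bits, or -1 if ctxt is NULL (assumption). */
static int
demo_ctxt_get_options(const struct demo_ctxt *ctxt) {
    if (ctxt == NULL)
        return -1;
    return ctxt->options;
}

/* Returns 1 if every bit in mask is set, 0 otherwise (including NULL). */
static int
demo_ctxt_has_option(const struct demo_ctxt *ctxt, int mask) {
    int options = demo_ctxt_get_options(ctxt);

    assert(mask > 0);

    if (options < 0)
        return 0;
    return (options & mask) == mask;
}
```

Callers that only go through the getter keep working if the struct layout changes later, which is the usual motivation for deprecating direct member access.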
Rosen Penev
217e9b7af2
clang-tidy: don't return in void functions
Found with readability-redundant-control-flow
Signed-off-by: Rosen Penev <rosenp@gmail.com>
2024-06-20 20:37:34 +00:00
Nick Wellnhofer
32cac377c8
parser: Selectively reenable reading from "-"
Make filename "-" mean stdin for legacy SAX1 functions and xmlReadFile.
This should hopefully fix most command line utilities.
See #737.
2024-06-17 18:08:31 +02:00