mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-23 01:52:48 +03:00

1109 Commits

Author SHA1 Message Date
Michael Mann
cf4f967266 Add XML_PARSE_SKIP_IDS to replace XML_SKIP_IDS
Mark loadset member as deprecated

Fixes #873
2025-06-22 08:03:34 -04:00
Nick Wellnhofer
a3992815b3 parser: Fix buffer overflow when parsing PublicIds
Regressed with 8231c0366 and 30665ae4.
2025-06-12 13:51:37 +02:00
Nick Wellnhofer
30665ae4d1 parser: Fix parsing of PublicIds and VersionNums
Regressed in 8231c0366.

Fixes #940.
2025-06-11 18:36:50 +02:00
Nick Wellnhofer
416da89d0b html: Make htmlCtxtReset call xmlCtxtReset
The two implementations shouldn't diverge.
2025-06-08 14:22:32 +02:00
Alex Richardson
7e4247b278 parser: use XML_INT_TO_PTR when storing integers as pointers
This fixes warnings when using a CHERI-aware toolchain.
2025-06-06 12:11:54 -07:00
Nick Wellnhofer
2b6b3945f2 Revert "SAX1: Align handling of default attributes with SAX2"
This reverts commit db65b2fc51.

This didn't check for duplicate default attributes.
2025-06-03 16:21:56 +02:00
Nick Wellnhofer
30375877d9 parser: Fix custom SAX parsers without cdataBlock handler
Use characters handler if cdataBlock handler is NULL.

Regressed with 57e4bbd8. Should fix #934.
2025-06-03 16:21:48 +02:00
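
The fallback described in this commit message ("Use characters handler if cdataBlock handler is NULL") can be sketched roughly like this. The handler table and function names below are hypothetical and much simpler than libxml2's `xmlSAXHandler`; the sketch only shows the dispatch order, under the assumption that both callbacks share one signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical text callback type, not libxml2's. */
typedef void (*demo_text_cb)(void *user_data, const char *text, int len);

/* Hypothetical, simplified handler table. */
typedef struct {
    demo_text_cb characters;   /* regular text callback, may be NULL */
    demo_text_cb cdata_block;  /* optional CDATA callback, may be NULL */
} demo_sax_handler;

/* Deliver a CDATA section: prefer cdata_block, otherwise fall back
 * to characters. Returns 1 if a callback ran, 0 if neither is set. */
static int demo_emit_cdata(const demo_sax_handler *sax, void *user_data,
                           const char *text, int len)
{
    assert(sax != NULL);
    if (sax->cdata_block != NULL) {
        sax->cdata_block(user_data, text, len);
        return 1;
    }
    if (sax->characters != NULL) {
        sax->characters(user_data, text, len);
        return 1;
    }
    return 0;
}

/* Example callback: adds the received length to an int counter. */
static void demo_count_chars(void *user_data, const char *text, int len)
{
    (void)text;
    *(int *)user_data += len;
}
```

With this ordering, a custom parser that only installs a characters-style callback still receives CDATA content instead of silently losing it.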
Nick Wellnhofer
479f26f92f regexp: Remove unfinished reimplementation
This was never enabled.
2025-06-03 00:28:16 +02:00
Nick Wellnhofer
0f8543e11d parser: Fix error reporting in xmlSkipBlankCharsPEBalanced
Short-lived regression.
2025-06-02 14:19:01 +02:00
Nick Wellnhofer
6a6a46f017 doc: Fix autolink errors
Fix links, remove links to internal functions.
2025-05-28 16:02:41 +02:00
Nick Wellnhofer
7bd8d1d9cc doc: Prefix autolinks with '#'
Use `#func` instead of `func()` to ignore parameters and make all
autolinks work.
2025-05-28 16:01:52 +02:00
Nick Wellnhofer
8baa5de182 parser: Avoid integer overflow in xmlParseCharDataInternal
`nbchar` could overflow with larger than 2GB memory buffers which some
new APIs allow. This shouldn't affect memory safety.

Limit maximum amount of bytes passed to character callback to
XML_MAX_ITEMS (1e9).
2025-05-27 20:03:13 +02:00
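
One way to picture the limit from this commit message: when a buffer length is tracked as `size_t` but the callback takes an `int`, the data can be handed over in bounded pieces. The following is a sketch under that assumption, not the parser's actual logic; `demo_deliver_chunks`, `demo_sum_len` and `DEMO_MAX_ITEMS` are hypothetical names, with the constant set to the 1e9 value named in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant matching the 1e9 limit named in the commit. */
#define DEMO_MAX_ITEMS 1000000000

/* Hypothetical character callback taking an int length. */
typedef void (*demo_chars_cb)(void *user_data, const char *buf, int len);

/* Pass `size` bytes to `cb` in pieces of at most `max_chunk` bytes,
 * so the int length argument cannot overflow.
 * Returns the number of callback invocations. */
static size_t demo_deliver_chunks(const char *buf, size_t size,
                                  int max_chunk, demo_chars_cb cb,
                                  void *user_data)
{
    size_t calls = 0;

    assert(max_chunk > 0);
    assert(cb != NULL);
    while (size > 0) {
        int n = size > (size_t)max_chunk ? max_chunk : (int)size;

        cb(user_data, buf, n);
        buf += n;
        size -= (size_t)n;
        calls++;
    }
    return calls;
}

/* Example callback: adds the received length to an int counter. */
static void demo_sum_len(void *user_data, const char *buf, int len)
{
    (void)buf;
    *(int *)user_data += len;
}
```

A caller would pass `DEMO_MAX_ITEMS` as `max_chunk`. This simplified loop splits at arbitrary byte offsets, so it ignores concerns such as multi-byte character boundaries.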
Nick Wellnhofer
ab06bfa1f6 parser: Fix error return in xmlParseElementContentDecl
Avoid internal error later in xmlValidBuildAContentModel after
2a60ca06c.

Also avoids some unnecessary error messages.
2025-05-26 16:51:59 +02:00
Nick Wellnhofer
4dc44c83ab parser: Rework entity boundary check for element content
Only use depth of input stack. This makes the input ID unused
internally.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
74ea6b483c parser: Start using input depth for entity boundary check
Now that we make sure that PEs starting markup won't be popped
implicitly, it's enough to check that no new entities are on the stack
when checking boundaries.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
db65b2fc51 SAX1: Align handling of default attributes with SAX2
The SAX1 parser is legacy code, but it seems more maintainable to align
it with SAX2.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
e4cbc295fa parser: Check attribute normalization standalone constraint
To fully implement "VC: Standalone Document Declaration", we have to
check for normalization changes caused by non-CDATA attribute types
declared externally.

Fixes #119.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
682195c869 parser: Fix "Proper Declaration/PE Nesting" validity constraint
Now that we handle "WFC: PE Between Declarations" correctly, we can turn
"Proper Declaration/PE Nesting" from a WFC into VC as specified.

Fixes #118.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
2f3655c9c3 parser: Pop PEs that start markup declarations explicitly
We currently only handle "Validity constraint: Proper Declaration/PE
Nesting", but we must detect "Well-formedness constraint: PE Between
Declarations" separately:

> The replacement text of a parameter entity reference in a DeclSep must
> match the production extSubsetDecl.

PEs in DeclSeps are PEs that start with a full markup declaration (or
another PE). These are handled in xmlParse{Internal|External}Subset. We
set a flag on these PEs and don't close them implicitly in
xmlSkipBlankCharsPE. This will make unterminated declarations in such
PEs cause a parser error. The PEs are closed explicitly in
xmlParse{Internal|External}Subset, the only location where they are
allowed to end.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
2a60ca06c0 valid: Don't check enum values
Rely on the parser to pass valid arguments.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
dd1961e0d8 valid: Skip more validity checks if not validating 2025-05-25 14:26:30 +02:00
Nick Wellnhofer
47aca2c6c9 parser: Only check validity constraints when validating 2025-05-19 20:07:54 +02:00
Nick Wellnhofer
172550d225 parser: Only validate EnumerationTypes when requested
This has quadratic behavior and is only a validity constraint.
2025-05-19 19:58:33 +02:00
Nick Wellnhofer
7008740a96 parser: Consolidate scanning of XML Names
Use new productions by default.

Fixes #194.
Fixes #364.
See #707.
2025-05-19 19:58:33 +02:00
Nick Wellnhofer
657254a87f parser: Factor out xmlIsNameCharNew/Old 2025-05-18 01:23:25 +02:00
Nick Wellnhofer
c5b45fbc07 doc: Misc fixes 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
6f4b452742 parser: Stop using ctxt->linenumbers
I think this was used to avoid setting the `line` member before it was
added (20+ years ago).
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
adfbeb7e08 doc: Stop using *Ptr typedefs in documentation 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a40f36e7f2 include: Stop using *Ptr typedefs in public headers 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
442c1903af doc: Fix some damage from automated conversions
Add some newlines, fix returns.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
ad390a5d14 parser: Set doc properties in endDocument SAX handler 2025-05-11 20:29:25 +02:00
Nick Wellnhofer
9bbffec568 doc: Move brief to top, params to bottom of doc comments 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
1bf44f09ba doc: Misc fixes to parser docs 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
4a01087585 doc: Move parser option docs to enum 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
cb1635a642 doc: Use @since command 2025-05-02 19:05:25 +02:00
Nick Wellnhofer
e78e05c990 doc: Fix autolinks to functions
Unfortunately, autolinks in .c files aren't converted by Doxygen for
some reason.
2025-05-02 17:45:31 +02:00
Nick Wellnhofer
f7c412874b doc: Remove more comment block headers 2025-05-02 17:41:26 +02:00
Nick Wellnhofer
1eca6e3476 parser: Deprecate xmlClearParserCtxt 2025-05-02 13:33:35 +02:00
Nick Wellnhofer
e525564f65 doc: Remove empty lines at start of block
These lines were left over after automatic conversion.
2025-05-02 11:42:05 +02:00
Nick Wellnhofer
e549622bc5 doc: Convert documentation to Doxygen
Automated conversion based on a few regexes.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
69879da88f doc: Remove email addresses from documentation
Also remove authorship information from generated files, hash.c and
globals.c which were rewritten.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
61890e399d doc: Prepare for conversion to Doxygen
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).

Fix formatting in a few corner cases that automatic conversion can't
handle.

Rearrange some DOC_DISABLE blocks.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
0bac84b1bd Add missing NULL checks to public API functions 2025-04-25 13:15:29 +02:00
Nick Wellnhofer
72906f161c parser: Make undeclared entities in XML content fatal
When parsing XML content with functions like xmlParseBalancedChunk or
xmlParseInNodeContext, make undeclared entities always a fatal error to
match 2.13 behavior.

This was deliberately changed in 4f329dc5, probably to make the tests
pass.

Should fix #895.
2025-04-25 13:15:29 +02:00
Nick Wellnhofer
b85d77d156 http: Remove built-in HTTP client
Stubs are retained for ABI compatibility.

Fixes #631.
Obsoletes #160.
2025-04-20 18:21:06 +02:00
Nick Wellnhofer
a5c4a6efe7 parser: Fix XML_PARSE_NOBLANKS dropping non-whitespace text
Regressed with 1f5b5371.

Fixes #884.
2025-03-28 16:52:34 +01:00
Nick Wellnhofer
69b83bb68e encoding: Detect truncated multi-byte sequences with ICU
Unlike iconv or the internal converters, ICU consumes truncated multi-
byte sequences at the end of an input buffer. We currently check for a
non-empty raw input buffer to detect truncated sequences, so this fails
with ICU.

It might be possible to inspect the pivot buffer pointers, but it seems
cleaner to implement a `flush` flag for some encoding and I/O functions.
After flushing, we can check for U_TRUNCATED_CHAR_FOUND with ICU, or
detect remaining input with other converters.

Also fix detection of truncated sequences for HTML, XML content and
DTDs with iconv.
2025-03-13 22:15:10 +01:00
Nick Wellnhofer
8696ebe182 parser: Fix ignorableWhitespace callback
If ignorableWhitespace differs from the "characters" callback, we have
to check for blanks as well.

Regressed with 1f5b537.
2025-03-11 16:34:30 +01:00
Nick Wellnhofer
25490528af parser: Fix spurious error in SAX mode
Short-lived regression from 5f0b1378.
2025-03-11 16:34:30 +01:00
Nick Wellnhofer
5f0b1378d7 parser: Add more parser context accessors
Fixes #763.
2025-03-08 22:36:06 +01:00