mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00

7744 Commits

Author SHA1 Message Date
Nick Wellnhofer
2f3655c9c3 parser: Pop PEs that start markup declarations explicitly
We currently only handle "Validity constraint: Proper Declaration/PE
Nesting", but we must detect "Well-formedness constraint: PE Between
Declarations" separately:

> The replacement text of a parameter entity reference in a DeclSep must
> match the production extSubsetDecl.

PEs in DeclSeps are PEs that start with a full markup declaration (or
another PE). These are handled in xmlParse{Internal|External}Subset. We
set a flag on these PEs and don't close them implicitly in
xmlSkipBlankCharsPE. This will make unterminated declarations in such
PEs cause a parser error. The PEs are closed explicitly in
xmlParse{Internal|External}Subset, the only location where they are
allowed to end.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
2a60ca06c0 valid: Don't check enum values
Rely on the parser to pass valid arguments.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
4aa7192f21 tests: Add dtor for xmlElementContent in testapi.c 2025-05-25 14:26:30 +02:00
Nick Wellnhofer
fc1cabc822 valid: Also raise duplicate ID error without validation support
Whether an error is raised should not depend on config options.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
dd1961e0d8 valid: Skip more validity checks if not validating 2025-05-25 14:26:30 +02:00
Nick Wellnhofer
6c2bd9758f valid: Don't validate unused default attributes
See erratum E9 of XML 1.0 Second Edition.

See #120.
2025-05-25 14:26:30 +02:00
Nick Wellnhofer
fca0860d6c tree: Deprecate public struct members related to DTDs
Let's deprecate these members for now. If these are really used, they
can be undeprecated later.
2025-05-25 14:26:30 +02:00
Dag-Erling Smørgrav
3ab040c203 Fix unidiomatic use of vsnprintf().
* Don't terminate an already-terminated buffer.
* Consistently use 1024-byte buffers.
* While here, consistently use ap for a va_list.
2025-05-24 01:28:49 +02:00
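
The vsnprintf() points in the commit above can be illustrated with a minimal sketch. It is not code from libxml2: `format_message` and `MSG_BUF_SIZE` are hypothetical names chosen for this example, and the 1024-byte size simply mirrors the figure in the commit message. The sketch relies on the C99 guarantee that vsnprintf() NUL-terminates its output whenever the size argument is non-zero, so no manual terminator is written after a successful call.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Assumption: mirrors the "1024-byte buffers" mentioned in the commit. */
#define MSG_BUF_SIZE 1024

/*
 * Hypothetical helper, not a libxml2 function: format a message into a
 * caller-supplied buffer of MSG_BUF_SIZE bytes. Returns the number of
 * characters stored, excluding the terminating NUL.
 */
static size_t
format_message(char *buf, const char *fmt, ...) {
    va_list ap;   /* "ap" is the conventional name for a va_list */
    int len;

    assert(buf != NULL && fmt != NULL);

    va_start(ap, fmt);
    /* With a non-zero size, C99 vsnprintf always NUL-terminates, */
    /* so no "buf[MSG_BUF_SIZE - 1] = 0" is needed afterwards.    */
    len = vsnprintf(buf, MSG_BUF_SIZE, fmt, ap);
    va_end(ap);

    if (len < 0) {
        /* Encoding error: buffer contents are unspecified, so reset. */
        buf[0] = '\0';
        return 0;
    }
    if ((size_t) len >= MSG_BUF_SIZE) {
        /* Output was truncated; the buffer holds MSG_BUF_SIZE - 1 chars. */
        return MSG_BUF_SIZE - 1;
    }
    return (size_t) len;
}
```

A caller would declare `char buf[MSG_BUF_SIZE];` and pass it together with a format string. The return value of vsnprintf() is the length the full output would have had, which is why the truncation case is handled separately.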
Dag-Erling Smørgrav
8ea253b895 Remove bogus casts.
* Casting a string literal to `char *` and then immediately passing or
  assigning the result to a `const char *` makes no sense.
* There is no need to cast `int` to `Py_ssize_t` as they have the same
  sign and the latter is at least as wide as the former.
2025-05-24 01:28:21 +02:00
Nick Wellnhofer
7c9b55356d doc: Document unused error domains 2025-05-19 20:07:54 +02:00
Nick Wellnhofer
47aca2c6c9 parser: Only check validity constraints when validating 2025-05-19 20:07:54 +02:00
Nick Wellnhofer
3a68d0b7a8 SAX2: Handle xml:id errors separately 2025-05-19 20:07:54 +02:00
Nick Wellnhofer
172550d225 parser: Only validate EnumerationTypes when requested
This has quadratic behavior and is only a validity constraint.
2025-05-19 19:58:33 +02:00
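
To show why such a check can be quadratic, here is a rough sketch of a duplicate-token test in the spirit of the XML 1.0 "No Duplicate Tokens" validity constraint. It is not the libxml2 implementation: `find_duplicate_token` and its `validating` parameter are hypothetical names, and a plain array of strings is assumed in place of the parser's own data structures. Each token is compared against all earlier ones, so the cost grows with the square of the number of tokens, and the work is skipped entirely when validation was not requested.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a libxml2 function: return the index of the
 * first token that repeats an earlier one, or -1 if there is none or if
 * validation is disabled.
 */
static int
find_duplicate_token(const char *const *tokens, size_t n, int validating) {
    size_t i, j;

    assert(tokens != NULL || n == 0);

    /* Validity constraints are only checked when validating. */
    if (!validating)
        return -1;

    /* Pairwise comparison: O(n^2) string comparisons in the worst case. */
    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            if (strcmp(tokens[i], tokens[j]) == 0)
                return (int) i;
        }
    }
    return -1;
}
```

For the tokens `"a", "b", "a"` the function reports index 2 when validating and -1 otherwise. A hash set would avoid the quadratic cost, but the sketch keeps the naive form to match the behavior the commit message describes.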
Nick Wellnhofer
7008740a96 parser: Consolidate scanning of XML Names
Use new productions by default.

Fixes #194.
Fixes #364.
See #707.
2025-05-19 19:58:33 +02:00
Nick Wellnhofer
657254a87f parser: Factor out xmlIsNameCharNew/Old 2025-05-18 01:23:25 +02:00
Nick Wellnhofer
315bd443c5 meson: Switch to cfg_data.set10() 2025-05-17 18:59:52 +02:00
Nick Wellnhofer
4e5945fc3c cmake: Avoid overlinking with non-CMake libxml2-config.cmake
Align libxml2-config.cmake generated by Autotools and Meson with the
CMake version and only add dependencies to libraries when linking
statically. Also set LIBXML_STATIC for static builds.

Fixes #918.
2025-05-17 18:52:36 +02:00
Nick Wellnhofer
faaa01b8c1 cmake: Make iconv a private dependency
This was only needed for the headers before 2.14.
2025-05-17 18:52:11 +02:00
Nick Wellnhofer
70e5d664ea doc: Don't document deprecated headers 2025-05-17 01:31:55 +02:00
Nick Wellnhofer
7c82391c64 codegen: Factor out code to generate range tables 2025-05-17 01:29:37 +02:00
Nick Wellnhofer
502c5f658a meson: Dependency on directory doesn't work 2025-05-17 00:11:03 +02:00
Nick Wellnhofer
210f5a3746 chvalid: Mark functions as deprecated 2025-05-16 23:27:51 +02:00
Nick Wellnhofer
954aae907d doc: Improve regexp documentation 2025-05-16 21:13:17 +02:00
Nick Wellnhofer
cbad60ff81 xmllint: Remove unused macros 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
2132150d08 xmllint: Switch to xmlCtxtGetDocument 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
c5b45fbc07 doc: Misc fixes 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
c4926b19d3 codegen: Merge xmlunicode.c into xmlregexp.c
Include generated parts.

Generate xmlChRangeGroups instead of functions for Unicode blocks.
2025-05-16 19:04:20 +02:00
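
The idea of replacing one function per Unicode block with table data can be sketched as follows. This is an illustration under stated assumptions, not the generated libxml2 code: `CodeRange`, `sampleBlocks` and `range_table_contains` are hypothetical names, and the layout is deliberately simpler than libxml2's actual xmlChRangeGroup. A sorted table of inclusive ranges is searched with a binary search, so one generic lookup routine can serve every table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: one inclusive range of code points. */
typedef struct {
    unsigned low;
    unsigned high;
} CodeRange;

/* Sample table, sorted by code point, with non-overlapping ranges. */
static const CodeRange sampleBlocks[] = {
    { 0x0000, 0x007F },  /* Basic Latin */
    { 0x0370, 0x03FF },  /* Greek and Coptic */
    { 0x0400, 0x04FF },  /* Cyrillic */
};

/*
 * Hypothetical helper: return 1 if code point c falls inside any range
 * of the sorted table, 0 otherwise. Binary search over the ranges.
 */
static int
range_table_contains(const CodeRange *table, size_t n, unsigned c) {
    size_t lo = 0, hi = n;

    assert(table != NULL || n == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (c < table[mid].low)
            hi = mid;            /* c lies before this range */
        else if (c > table[mid].high)
            lo = mid + 1;        /* c lies after this range */
        else
            return 1;            /* low <= c <= high */
    }
    return 0;
}
```

With this table, U+03B1 (Greek small letter alpha) is found, while U+0500 (the first code point after the Cyrillic block) is not. Keeping the data in tables means only the ranges need to be regenerated when the Unicode data changes.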
Nick Wellnhofer
4cb767e96e codegen: Only generate tables for character ranges
The rest can be easily maintained manually.
2025-05-16 19:04:20 +02:00
Nick Wellnhofer
770c6decd8 buf: Remove ABI compatibility hack
I think this was required when some struct members like
xmlParserInputBuffer::buffer were changed from xmlBuffer to xmlBuf (20+
years ago).

Unfortunately, I missed the opportunity to align xmlBuffer with xmlBuf
before the ABI break.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
344190dbf6 doc: Document deprecated xmlThrDef* functions 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
6f4b452742 parser: Stop using ctxt->linenumbers
I think this was used to avoid setting the `line` member before it was
added (20+ years ago).
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
5ce48ec131 SAX2: Rework xmlSAX2Text
Simplify and make more readable.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
d834437b59 python: Add deprecation warning 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a05fa9a905 codegen: Rerun codegen scripts 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
258d870629 codegen: Consolidate tools for code generation
Move tools, source files and output tables into codegen directory.

Rename some files.

Adjust tools to match modified files. Remove generation date and source
files from output.

Distribute all tools and sources.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
0d34d690c4 README: Update configuration options
Python is disabled by default now. Mention --prefix.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
adfbeb7e08 doc: Stop using *Ptr typedefs in documentation 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a40f36e7f2 include: Stop using *Ptr typedefs in public headers 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
0da20b834f autotools: Quote filenames in doc/Makefile.am 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
2d83a84ca6 doc: Misc improvements 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
87087def4e tests: Remove result files committed by accident 2025-05-13 23:00:51 +02:00
Nick Wellnhofer
d6151c2337 libxml2.doap: Remove inactive maintainer 2025-05-13 23:00:51 +02:00
Nick Wellnhofer
af4fae5ae3 html: Add some comments regarding HTML5 serialization
It seems that the specification of the HTML output method in XSLT 1.0
had a lot of influence on how the HTML serializer in libxml2 ended up:

https://www.w3.org/TR/xslt-10/#section-HTML-Output-Method

There are two remaining behaviors suggested by XSLT 1.0 that don't match
the HTML5 fragment serialization algorithm:

We escape non-ASCII characters in URI attributes (the list of which is
probably outdated). This was originally recommended in appendix B of the
HTML 4.01 spec, but only for user agents:

https://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1

From my experience, any tool that processes HTML should escape as little
as possible. For example, we used to escape many more characters which
are invalid in URIs, but often used in template languages. (Note that we
still escape whitespace and control chars.) Nevertheless, I guess that
some libxslt users continue to expect this behavior from libxml2.

Then we collapse Boolean attributes using an outdated list. This is
mostly a cosmetic issue, but a somewhat important one for libxslt users.

We probably need a serialization option for the xmlsave module that
enables fully HTML5-conformant output.
2025-05-13 23:00:51 +02:00
Nick Wellnhofer
b0234633e7 encoding: Preserve original encoding label
When using built-in encodings, the label would be normalized which
causes various issues. We now create a copy of the handler with the
original name.

This is somewhat dangerous as it will require users to free built-in
encodings with xmlCharEncCloseFunc. But to handle the general case, this
was already required.

Fixes #916 in another way than originally proposed.
2025-05-13 22:53:02 +02:00
Nick Wellnhofer
fcb7a777ce io: Make xmlOutputBufferCreate* not free encoder on error
Revert a530ff12 which was an inadvertent API change.
2025-05-13 22:44:42 +02:00
Nick Wellnhofer
5b71dca613 Fix -Wunterminated-string-initialization warnings
Don't use strings for table.
2025-05-12 21:58:06 +02:00
Nick Wellnhofer
cdce17c3cb html: Only map HTML encodings from meta tag 2025-05-12 21:21:25 +02:00
Nick Wellnhofer
19b9931184 encoding: Fix -Wswitch warning 2025-05-12 21:07:41 +02:00
Nick Wellnhofer
39ae5d1265 save: Add NULL check in xmlBufDumpEntityContent
Short-lived regression.
2025-05-12 21:04:41 +02:00
Nick Wellnhofer
c2929b5dd3 html: Ignore namespaces when handling meta tags
Revert to old behavior to fix issues with XHTML documents.
2025-05-12 21:01:35 +02:00