mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00
Commit Graph

7528 Commits

Author SHA1 Message Date
Nick Wellnhofer
4e5945fc3c cmake: Avoid overlinking with non-CMake libxml2-config.cmake
Align libxml2-config.cmake generated by Autotools and Meson with the
CMake version and only add dependencies to libraries when linking
statically. Also set LIBXML_STATIC for static builds.

Fixes #918.
2025-05-17 18:52:36 +02:00
Nick Wellnhofer
faaa01b8c1 cmake: Make iconv a private dependency
This was only needed for the headers before 2.14.
2025-05-17 18:52:11 +02:00
Nick Wellnhofer
70e5d664ea doc: Don't document deprecated headers 2025-05-17 01:31:55 +02:00
Nick Wellnhofer
7c82391c64 codegen: Factor out code to generate range tables 2025-05-17 01:29:37 +02:00
Nick Wellnhofer
502c5f658a meson: Dependency on directory doesn't work 2025-05-17 00:11:03 +02:00
Nick Wellnhofer
210f5a3746 chvalid: Mark functions as deprecated 2025-05-16 23:27:51 +02:00
Nick Wellnhofer
954aae907d doc: Improve regexp documentation 2025-05-16 21:13:17 +02:00
Nick Wellnhofer
cbad60ff81 xmllint: Remove unused macros 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
2132150d08 xmllint: Switch to xmlCtxtGetDocument 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
c5b45fbc07 doc: Misc fixes 2025-05-16 19:04:20 +02:00
Nick Wellnhofer
c4926b19d3 codegen: Merge xmlunicode.c into xmlregexp.c
Include generated parts.

Generate xmlChRangeGroups instead of functions for Unicode blocks.
2025-05-16 19:04:20 +02:00
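The idea of replacing per-block functions with generated range tables can be sketched as follows. This is only an illustration: `CodeRange`, `latinLetters` and `rangeTableContains` are hypothetical names, not the `xmlChRangeGroup` layout used by libxml2, and the binary search is an assumed lookup strategy that the commit does not describe.

```c
#include <stddef.h>

/* Hypothetical range type for illustration; not libxml2's actual struct. */
typedef struct {
    unsigned int low;   /* first code point of the range (inclusive) */
    unsigned int high;  /* last code point of the range (inclusive) */
} CodeRange;

/* Sorted, non-overlapping ranges: the ASCII letters A-Z and a-z. */
static const CodeRange latinLetters[] = {
    { 0x41, 0x5A },
    { 0x61, 0x7A },
};

/* Binary search over a sorted range table. Returns 1 if cp is covered. */
static int
rangeTableContains(const CodeRange *table, size_t count, unsigned int cp) {
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (cp < table[mid].low)
            hi = mid;
        else if (cp > table[mid].high)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}
```

With a table like this, a generator only has to emit data, and one shared lookup routine serves every block, for example `rangeTableContains(latinLetters, 2, 'q')`.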
Nick Wellnhofer
4cb767e96e codegen: Only generate tables for character ranges
The rest can be easily maintained manually.
2025-05-16 19:04:20 +02:00
Nick Wellnhofer
770c6decd8 buf: Remove ABI compatibility hack
I think this was required when some struct members like
xmlParserInputBuffer::buffer were changed from xmlBuffer to xmlBuf (20+
years ago).

Unfortunately, I missed the opportunity to align xmlBuffer with xmlBuf
before the ABI break.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
344190dbf6 doc: Document deprecated xmlThrDef* functions 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
6f4b452742 parser: Stop using ctxt->linenumbers
I think this was used to avoid setting the `line` member before it was
added (20+ years ago).
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
5ce48ec131 SAX2: Rework xmlSAX2Text
Simplify and make more readable.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
d834437b59 python: Add deprecation warning 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a05fa9a905 codegen: Rerun codegen scripts 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
258d870629 codegen: Consolidate tools for code generation
Move tools, source files and output tables into codegen directory.

Rename some files.

Adjust tools to match modified files. Remove generation date and source
files from output.

Distribute all tools and sources.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
0d34d690c4 README: Update configuration options
Python is disabled by default now. Mention --prefix.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
adfbeb7e08 doc: Stop using *Ptr typedefs in documentation 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a40f36e7f2 include: Stop using *Ptr typedefs in public headers 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
0da20b834f autotools: Quote filenames in doc/Makefile.am 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
2d83a84ca6 doc: Misc improvements 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
87087def4e tests: Remove result files committed by accident 2025-05-13 23:00:51 +02:00
Nick Wellnhofer
d6151c2337 libxml2.doap: Remove inactive maintainer 2025-05-13 23:00:51 +02:00
Nick Wellnhofer
af4fae5ae3 html: Add some comments regarding HTML5 serialization
It seems that the specification of the HTML output method in XSLT 1.0
had a lot of influence on how the HTML serializer in libxml2 ended up:

https://www.w3.org/TR/xslt-10/#section-HTML-Output-Method

There are two remaining behaviors suggested by XSLT 1.0 that don't match
the HTML5 fragment serialization algorithm:

We escape non-ASCII characters in URI attributes (the list of which is
probably outdated). This was originally recommended in appendix B of the
HTML 4.01 spec, but only for user agents:

https://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1

From my experience, any tool that processes HTML should escape as little
as possible. For example, we used to escape many more characters which
are invalid in URIs, but often used in template languages. (Note that we
still escape whitespace and control chars.) Nevertheless, I guess that
some libxslt users continue to expect this behavior from libxml2.

Then we collapse Boolean attributes using an outdated list. This is
mostly a cosmetic issue, but a somewhat important one for libxslt users.

We probably need a serialization option for the xmlsave module that
enables fully HTML5-conformant output.
2025-05-13 23:00:51 +02:00
Nick Wellnhofer
b0234633e7 encoding: Preserve original encoding label
When using built-in encodings, the label would be normalized which
causes various issues. We now create a copy of the handler with the
original name.

This is somewhat dangerous as it will require users to free built-in
encodings with xmlCharEncCloseFunc. But to handle the general case, this
was already required.

Fixes #916 in another way than originally proposed.
2025-05-13 22:53:02 +02:00
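The ownership consequence described above can be illustrated with a small stand-alone sketch. Everything here is hypothetical (`EncHandler`, `openHandler`, `closeHandler`, the two-entry table); it is not libxml2's API. It only shows why a lookup that keeps the caller's original label has to return a heap copy, and why every handler then needs to be closed, built-in or not.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler type; libxml2's real handler struct differs. */
typedef struct {
    char *name;  /* label exactly as the caller spelled it */
} EncHandler;

/* Hypothetical list of canonical built-in names. */
static const char *const builtinNames[] = { "UTF-8", "ISO-8859-1" };

static int
equalsIgnoreCase(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns a heap-allocated copy carrying the original label, or NULL. */
static EncHandler *
openHandler(const char *label) {
    size_t count = sizeof(builtinNames) / sizeof(builtinNames[0]);
    size_t i, len;
    EncHandler *handler;

    for (i = 0; i < count; i++) {
        if (equalsIgnoreCase(label, builtinNames[i]))
            break;
    }
    if (i == count)
        return NULL;

    handler = malloc(sizeof(*handler));
    if (handler == NULL)
        return NULL;

    len = strlen(label) + 1;
    handler->name = malloc(len);
    if (handler->name == NULL) {
        free(handler);
        return NULL;
    }
    memcpy(handler->name, label, len);  /* keep the label unnormalized */
    return handler;
}

/* Must be called for every handler, including built-in encodings. */
static void
closeHandler(EncHandler *handler) {
    if (handler == NULL)
        return;
    free(handler->name);
    free(handler);
}
```

In this sketch, `openHandler("utf-8")` yields a handler whose name is still `utf-8` rather than the canonical `UTF-8`, and skipping `closeHandler` would leak it.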
Nick Wellnhofer
fcb7a777ce io: Make xmlOutputBufferCreate* not free encoder on error
Revert a530ff12 which was an inadvertent API change.
2025-05-13 22:44:42 +02:00
Nick Wellnhofer
5b71dca613 Fix -Wunterminated-string-initialization warnings
Don't use strings for table.
2025-05-12 21:58:06 +02:00
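For context, this kind of warning is typically triggered when a string literal exactly fills a fixed-size `char` array, leaving no room for the terminating NUL. The sketch below shows the general pattern with a hypothetical `hexDigits` table; it is not the table touched by the commit.

```c
#include <stddef.h>

/*
 * A 16-character literal in a 16-byte array drops the trailing NUL,
 * which is what the warning is about:
 *
 *     static const char hexDigits[16] = "0123456789ABCDEF";
 *
 * Spelling the table out as individual characters makes it clear that
 * the array is a lookup table and not a C string.
 */
static const char hexDigits[16] = {
    '0', '1', '2', '3', '4', '5', '6', '7',
    '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
};

/* Returns the hex digit for the low four bits of value. */
static char
hexDigit(unsigned int value) {
    return hexDigits[value & 0xF];
}
```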
Nick Wellnhofer
cdce17c3cb html: Only map HTML encodings from meta tag 2025-05-12 21:21:25 +02:00
Nick Wellnhofer
19b9931184 encoding: Fix -Wswitch warning 2025-05-12 21:07:41 +02:00
Nick Wellnhofer
39ae5d1265 save: Add NULL check in xmlBufDumpEntityContent
Short-lived regression.
2025-05-12 21:04:41 +02:00
Nick Wellnhofer
c2929b5dd3 html: Ignore namespaces when handling meta tags
Revert to old behavior to fix issues with XHTML documents.
2025-05-12 21:01:35 +02:00
Nick Wellnhofer
4df8d55742 io: Fix stack use after scope
Short-lived regression.
2025-05-12 17:31:14 +02:00
Nick Wellnhofer
f0983199e8 html: Map some encodings according to HTML5
Windows-1252 is a superset of ISO-8859-1 and should be used instead.
Same for ASCII.

Also map UCS-2 and UTF-16 to UTF-16LE.
2025-05-12 14:04:30 +02:00
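The mapping described in the commit above could look roughly like this. It is a minimal sketch that covers only the four labels named in the message; `mapHtmlEncodingLabel` and `labelEquals` are hypothetical helpers, and the real implementation may handle more aliases.

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of two encoding labels. */
static int
labelEquals(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/*
 * Returns the encoding that should actually be used for a label found
 * in an HTML document, or the label itself if no mapping applies.
 */
static const char *
mapHtmlEncodingLabel(const char *label) {
    static const struct {
        const char *from;
        const char *to;
    } map[] = {
        { "ISO-8859-1", "windows-1252" },
        { "ASCII",      "windows-1252" },
        { "UCS-2",      "UTF-16LE" },
        { "UTF-16",     "UTF-16LE" },
    };
    size_t i;

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        if (labelEquals(label, map[i].from))
            return map[i].to;
    }
    return label;
}
```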
Nick Wellnhofer
93f671064e encoding: Add HTML5 aliases 2025-05-12 13:27:29 +02:00
Nick Wellnhofer
628006f457 encoding: Add windows-1252
Fixes #915.
2025-05-12 13:27:22 +02:00
Nick Wellnhofer
a7016baea6 tools: Remove unnecessary data from iso8859x.inc 2025-05-12 13:14:21 +02:00
Nick Wellnhofer
c92374f1b8 tools: Recreate script to generate iso8859x.inc
The script to create these tables was never committed to version
control.
2025-05-12 13:14:21 +02:00
Nick Wellnhofer
f602c0c186 html: Rework serialization of meta encoding attributes
Don't allocate memory.
2025-05-12 00:05:02 +02:00
Nick Wellnhofer
7654c2efc0 html: Rework serialization of URIs
Don't allocate memory.
2025-05-12 00:04:00 +02:00
Nick Wellnhofer
bd777e4f42 html: Speed up htmlIsBooleanAttr
This is used when serializing.
2025-05-11 23:28:40 +02:00
Nick Wellnhofer
825f3a9d0c html: Always serialize attributes with double quotes
Align with HTML5.
2025-05-11 21:42:51 +02:00
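Always using double quotes means that the quote character itself must be escaped inside the value. The following sketch shows the idea with a hypothetical `serializeAttr` helper that writes into a caller-supplied buffer. It escapes only `&` and `"`, which is a simplification; it is not the libxml2 serializer.

```c
#include <stddef.h>
#include <string.h>

/* Appends s to out at *pos. Returns -1 if the buffer is too small. */
static int
appendStr(char *out, size_t outSize, size_t *pos, const char *s) {
    size_t len = strlen(s);

    if (len >= outSize - *pos)  /* keep one byte for the NUL */
        return -1;
    memcpy(out + *pos, s, len);
    *pos += len;
    out[*pos] = '\0';
    return 0;
}

/* Writes name="value" with the value escaped. Returns 0 on success. */
static int
serializeAttr(char *out, size_t outSize, const char *name,
              const char *value) {
    size_t pos = 0;
    const char *p;

    if (outSize == 0)
        return -1;
    out[0] = '\0';

    if (appendStr(out, outSize, &pos, name) < 0 ||
        appendStr(out, outSize, &pos, "=\"") < 0)
        return -1;

    for (p = value; *p != '\0'; p++) {
        char single[2];
        const char *piece;

        if (*p == '&') {
            piece = "&amp;";
        } else if (*p == '"') {
            piece = "&quot;";
        } else {
            single[0] = *p;
            single[1] = '\0';
            piece = single;
        }
        if (appendStr(out, outSize, &pos, piece) < 0)
            return -1;
    }

    return appendStr(out, outSize, &pos, "\"");
}
```

Single quotes in the value need no escaping in this scheme, since they can never terminate a double-quoted attribute.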
Nick Wellnhofer
5c4cc456a4 html: Escape encoding in meta tags 2025-05-11 21:30:30 +02:00
Nick Wellnhofer
0674ccb7cb html: Stop omitting end tags when serializing
Align with HTML5.
2025-05-11 20:57:07 +02:00
Nick Wellnhofer
05b8fe0a06 html: Don't escape RAWTEXT and PLAINTEXT
Align with HTML5.
2025-05-11 20:57:07 +02:00
Nick Wellnhofer
809ded586b html: Add more empty elements
Add empty HTML5 elements <bgsound>, <keygen>, <source>, <track> and
<wbr>.

Make <embed> an empty element.
2025-05-11 20:46:50 +02:00
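A serializer usually decides whether to omit the end tag by looking the element name up in a table of empty elements. The sketch below assumes lowercase names and a deliberately short, hypothetical table containing the elements named in the commit plus `br`, `hr` and `img`; the actual list in libxml2 is longer and may be organized differently.

```c
#include <stdlib.h>
#include <string.h>

/* Partial, illustrative list. Must stay sorted for bsearch. */
static const char *const emptyElements[] = {
    "bgsound", "br", "embed", "hr", "img",
    "keygen", "source", "track", "wbr"
};

/* bsearch comparator: key is a string, elem points into the table. */
static int
compareName(const void *key, const void *elem) {
    return strcmp((const char *) key, *(const char *const *) elem);
}

/* Returns 1 if the element is serialized without content or end tag. */
static int
isEmptyElement(const char *name) {
    size_t count = sizeof(emptyElements) / sizeof(emptyElements[0]);

    return bsearch(name, emptyElements, count, sizeof(emptyElements[0]),
                   compareName) != NULL;
}
```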
Nick Wellnhofer
5f8ebc8809 save: Avoid xmlOutputBufferWriteQuotedString
xmlOutputBufferWriteQuotedString should be reserved for things like
system IDs.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
0d81d6f811 html: Use xmlOutputBufferWrite if possible 2025-05-11 20:29:25 +02:00