mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-24 13:33:01 +03:00
361 Commits

Author SHA1 Message Date
Nick Wellnhofer
258d870629 codegen: Consolidate tools for code generation
Move tools, source files and output tables into codegen directory.

Rename some files.

Adjust tools to match modified files. Remove generation date and source
files from output.

Distribute all tools and sources.
2025-05-16 18:03:12 +02:00
Nick Wellnhofer
adfbeb7e08 doc: Stop using *Ptr typedefs in documentation 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
a40f36e7f2 include: Stop using *Ptr typedefs in public headers 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
2d83a84ca6 doc: Misc improvements 2025-05-16 18:03:12 +02:00
Nick Wellnhofer
fcb7a777ce io: Make xmlOutputBufferCreate* not free encoder on error
Revert a530ff12 which was an inadvertent API change.
2025-05-13 22:44:42 +02:00
Nick Wellnhofer
4df8d55742 io: Fix stack use after scope
Short-lived regression.
2025-05-12 17:31:14 +02:00
Nick Wellnhofer
f602c0c186 html: Rework serialization of meta encoding attributes
Don't allocate memory.
2025-05-12 00:05:02 +02:00
Nick Wellnhofer
825f3a9d0c html: Always serialize attributes with double quotes
Align with HTML5.
2025-05-11 21:42:51 +02:00
Nick Wellnhofer
5f8ebc8809 save: Avoid xmlOutputBufferWriteQuotedString
xmlOutputBufferWriteQuotedString should be reserved for things like
system IDs.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
777e2adf77 io: Consolidate escaping code
Use generated table approach of xmlSerializeText for xmlEscapeText.

Move most code to xmlIO.c.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
dad1163078 entities: Always replace invalid chars when escaping
The previous refactor painstakingly recreated the different behavior of
separate functions that were merged. It makes

Optimize IS_CHAR check for non-ASCII chars.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
442c1903af doc: Fix some damage from automated conversions
Add some newlines, fix returns.
2025-05-11 20:29:25 +02:00
Nick Wellnhofer
a1e83b2401 io: Fix negation of potentially unsigned value 2025-05-11 20:29:25 +02:00
Nick Wellnhofer
9bbffec568 doc: Move brief to top, params to bottom of doc comments 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
f38f3e7b25 doc: Misc fixes to IO documentation 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
e78e05c990 doc: Fix autolinks to functions
Unfortunately, autolinks in .c files aren't converted by Doxygen for
some reason.
2025-05-02 17:45:31 +02:00
Nick Wellnhofer
f7c412874b doc: Remove more comment block headers 2025-05-02 17:41:26 +02:00
Nick Wellnhofer
e525564f65 doc: Remove empty lines at start of block
These lines were left over after automatic conversion.
2025-05-02 11:42:05 +02:00
Nick Wellnhofer
e549622bc5 doc: Convert documentation to Doxygen
Automated conversion based on a few regexes.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
69879da88f doc: Remove email addresses from documentation
Also remove authorship information from generated files, hash.c and
globals.c which were rewritten.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
61890e399d doc: Prepare for conversion to Doxygen
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).

Fix formatting in a few corner cases that automatic conversion can't
handle.

Rearrange some DOC_DISABLE blocks.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
b85d77d156 http: Remove built-in HTTP client
Stubs are retained for ABI compatibility.

Fixes #631.
Obsoletes #160.
2025-04-20 18:21:06 +02:00
Nick Wellnhofer
2c2578b6fe io: Use switch statement in xmlIOErr 2025-03-31 13:10:00 +02:00
Collin Funk
fa539305fa io: Remove duplicated conditionals. 2025-03-31 11:05:27 +00:00
Nick Wellnhofer
b349225952 include: Change some return types from int to enum
This also affects some new functions from 2.13.
2025-03-14 02:31:01 +01:00
Nick Wellnhofer
fd1b939168 include: Convert some macros to enums 2025-03-14 00:35:40 +01:00
Nick Wellnhofer
69b83bb68e encoding: Detect truncated multi-byte sequences with ICU
Unlike iconv or the internal converters, ICU consumes truncated multi-
byte sequences at the end of an input buffer. We currently check for a
non-empty raw input buffer to detect truncated sequences, so this fails
with ICU.

It might be possible to inspect the pivot buffer pointers, but it seems
cleaner to implement a `flush` flag for some encoding and I/O functions.
After flushing, we can check for U_TRUNCATED_CHAR_FOUND with ICU, or
detect remaining input with other converters.

Also fix detection of truncated sequences for HTML, XML content and
DTDs with iconv.
2025-03-13 22:15:10 +01:00
Nick Wellnhofer
d96911f100 doc: Documentation fixes 2025-03-08 23:03:26 +01:00
Nick Wellnhofer
a0f156fffb io: Fix compressed flag for uncompressed stdin
This could cause xmlstarlet to generate compressed output unexpectedly.

Regressed with a78843be. Should fix #869.
2025-03-02 13:22:56 +01:00
Nick Wellnhofer
a78843be5e xmllint: Support compressed input from stdin
Another regression related to reading from stdin.

Making a "-" filename read from stdin was deeply baked into the core
IO code but is inherently insecure. I really want to reenable this
dangerous feature as sparingly as possible.

This now enables compressed input when using the "Fd" API functions
which wasn't supported before. But XML_PARSE_NO_UNZIP will be
inverted later.

Allow compressed stdin in xmlReadFile to support xmlstarlet and older
versions of xsltproc. So far, these are the only known command-line
tools that rely on "-" meaning stdin.
2025-01-28 23:20:37 +01:00
Nick Wellnhofer
1c82bca6bd xmllint: Improve error reports from reader 2025-01-17 23:29:30 +01:00
Nick Wellnhofer
41c10c0cec io: Don't cast file descriptors to pointers
This doesn't work if open() returns 0 which is rare but can happen. Wrap
the fd in a context struct.

Fixes #835.
2025-01-03 20:15:52 +01:00
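
The problem described in 41c10c0cec can be sketched as follows. This is a minimal illustration, not the code from the commit: `FdCtxt`, `fdCtxtNew`, `fdCtxtGet` and `fdCtxtFree` are hypothetical names. If a descriptor is stored by casting it to `void *`, descriptor 0 turns into a null pointer and can no longer be told apart from "no context". Wrapping the descriptor in a small heap-allocated struct avoids the ambiguity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context struct; the names are illustrative only. */
typedef struct {
    int fd;
} FdCtxt;

/* Returns a heap-allocated context, or NULL only if malloc fails.
 * A valid context is never NULL, even when fd is 0. */
static void *
fdCtxtNew(int fd) {
    FdCtxt *ctxt = malloc(sizeof(*ctxt));

    if (ctxt == NULL)
        return NULL;
    ctxt->fd = fd;
    return ctxt;
}

/* Recover the descriptor inside an I/O callback. */
static int
fdCtxtGet(void *context) {
    assert(context != NULL);
    return ((FdCtxt *) context)->fd;
}

/* Release the context; closing the descriptor is left to the caller. */
static void
fdCtxtFree(void *context) {
    free(context);
}
```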
Nick Wellnhofer
b3871dd138 io: Fix memory leaks of encoding handler in error cases
xmlOutputBufferCreate* must always free the encoding handler.
2024-12-21 21:58:25 +01:00
Nick Wellnhofer
0dd910e82b save: Fix handling of catastrophic errors
Don't overwrite catastrophic errors in xmlSaveErr.

Overwrite non-catastrophic errors in xmlOutputBufferClose.
2024-12-19 02:30:36 +01:00
Nick Wellnhofer
1e4d8c55f0 xmlIO: Fix reading from non-regular files like pipes
Commit 7e14c05d removed unnecessary copying of uncompressed input
through zlib or xzlib. This broke input from non-regular files like
pipes which can't be reopened. Try to detect such files by checking
whether they're seekable and always pipe them through zlib or xzlib.

Also remove seemingly unnecessary calls to gzread and gzrewind to
support unseekable files.

Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/124.
2024-11-06 16:49:53 +01:00
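
The commit message for 1e4d8c55f0 does not say how seekability is checked. One common POSIX approach, shown here as an assumption rather than as the commit's actual code, is to call `lseek` without moving the offset and see whether it fails. `fdIsSeekable` is a hypothetical helper name.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if the descriptor supports seeking (for example a regular
 * file) and 0 if lseek fails, as it does for pipes, FIFOs and sockets.
 * The file offset is left unchanged.
 */
static int
fdIsSeekable(int fd) {
    assert(fd >= 0);
    return lseek(fd, 0, SEEK_CUR) != (off_t) -1;
}
```

Input that fails this check cannot be reopened or rewound, so it would have to be read in a single pass.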
Nick Wellnhofer
55ddccb645 io: Make sure not to pass partial UTF-8 to write callback
We cannot split UTF-8 at arbitrary boundaries.
2024-09-14 00:05:13 +02:00
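
To illustrate the constraint in 55ddccb645, here is a minimal sketch of finding a safe split point before handing data to a write callback. It is not the code from the commit; `utf8CompletePrefix` is a hypothetical helper, and it assumes the buffer holds valid UTF-8 that may only be cut off at the very end.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of the longest prefix of buf[0..len) that does not
 * end inside a multi-byte UTF-8 sequence. Bytes past the returned
 * length should be kept back until the rest of the sequence arrives.
 */
static size_t
utf8CompletePrefix(const unsigned char *buf, size_t len) {
    size_t start = len;
    size_t need;
    unsigned char lead;

    assert(buf != NULL || len == 0);

    /* Walk back over up to three continuation bytes (10xxxxxx). */
    while (start > 0 && len - start < 3 && (buf[start - 1] & 0xC0) == 0x80)
        start--;
    if (start == 0)
        return len; /* no lead byte in front: nothing to hold back */

    /* start now indexes the last lead (or ASCII) byte. */
    start--;
    lead = buf[start];

    if (lead >= 0xF0)
        need = 4;
    else if (lead >= 0xE0)
        need = 3;
    else if (lead >= 0xC0)
        need = 2;
    else
        need = 1;

    /* Hold back the final sequence if it is shorter than announced. */
    return (len - start < need) ? start : len;
}
```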
triallax
67ff748c3e io: don't set the executable bit when creating files
Issue seems to have been introduced in
0bef93bf24.
2024-08-26 23:53:29 +01:00
Nick Wellnhofer
f2c48847fa io: Add missing calls to xmlInitParser
This is required after c9a46a91.

Should fix #782.
2024-08-13 14:38:59 +02:00
Nick Wellnhofer
a530ff125d io: Always consume encoding handler when creating output buffers
Also free encoding handler in error case.

Remove xmlAllocOutputBufferInternal which was identical to
xmlAllocOutputBuffer.
2024-07-29 14:25:39 +02:00
Nick Wellnhofer
36ea881b9d malloc-fail: Fix memory leak in xmlOutputBufferCreateFilename
Close encoding handler on error.
2024-07-26 18:07:27 +02:00
Nick Wellnhofer
7b98e8d695 io: Don't call getcwd in xmlParserGetDirectory
The "directory" value isn't used internally. Calling getcwd is
unnecessary and can cause problems in sandboxed environments.

Fixes #770.
2024-07-18 03:22:20 +02:00
Nick Wellnhofer
eb66d03ef7 io: Deprecate a few functions 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
97680d6c08 io: Rework xmlParserInputBufferGrow
Remove dubious (len != 4) check.

Remove compression-related code. This should already be set when
opening the input.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
a6f54f055b io: Fine-tune initial IO buffer size 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
7148b77820 parser: Optimize memory buffer I/O
Reenable zero-copy IO for zero-terminated static memory buffers.

Don't stream zero-terminated dynamic memory buffers on top of creating
a copy.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
34c9108f15 encoding: Add sizeOut argument to xmlCharEncInput
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
a221cd7849 buf: Rework xmlBuf code
Always use what the old implementation called the "IO" allocation
scheme, allowing to move the content pointer past the initial
allocation. This is inexpensive and allows efficient shrinking.

Optimize xmlBufGrow, reusing shrunken memory as much as possible.

Simplify xmlBufAdd.

Make xmlBufBackToBuffer return an error on overflow.

Make "size" exclude the terminating NULL byte.

Always provide an initial size.

Reintroduce static buffers.

Remove xmlBufResize and several other functions.
2024-07-16 17:42:10 +02:00
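
The "IO" allocation scheme mentioned in a221cd7849 can be pictured with a simplified sketch. This is an assumption-laden illustration, not the real xmlBuf layout: `SketchBuf` and its functions are hypothetical. The point is that consumed data is dropped by advancing the content pointer instead of moving the remaining bytes, and `size` excludes the terminating zero byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified buffer; not the actual xmlBuf struct. */
typedef struct {
    char *mem;      /* start of the allocation */
    char *content;  /* start of the live data, always >= mem */
    size_t use;     /* bytes of live data */
    size_t size;    /* capacity from content, excluding the terminator */
} SketchBuf;

/* Initialize the buffer with a copy of a zero-terminated string. */
static int
sketchBufInit(SketchBuf *buf, const char *data) {
    size_t len = strlen(data);

    buf->mem = malloc(len + 1);
    if (buf->mem == NULL)
        return -1;
    memcpy(buf->mem, data, len + 1);
    buf->content = buf->mem;
    buf->use = len;
    buf->size = len;
    return 0;
}

/* Drop up to len bytes from the front in O(1) by advancing content. */
static size_t
sketchBufShrink(SketchBuf *buf, size_t len) {
    assert(buf != NULL);
    if (len > buf->use)
        len = buf->use;
    buf->content += len;
    buf->use -= len;
    buf->size -= len;
    return len;
}

/* Free the original allocation, not the advanced content pointer. */
static void
sketchBufFree(SketchBuf *buf) {
    free(buf->mem);
    buf->mem = NULL;
    buf->content = NULL;
    buf->use = 0;
    buf->size = 0;
}
```

The gap between `mem` and `content` is the space that a grow operation could reclaim before reallocating.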
Nick Wellnhofer
8d1606265d entities: Rework text escaping 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
cc45f618ae save: Rework text escaping
Stop using xmlOutputBufferWriteEscape except when using deprecated
xmlSaveSetEscape. Rewrite xmlOutputBufferWriteEscape to use an extra
buffer and call xmlOutputBufferWrite.

Introduce xmlSerializeText to serialize both text and attribute content.

Don't read encoding from document when serializing and remove all hacks
that temporarily changed the document's encoding.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
0ab07b21dd io: Rework xmlOutputBufferWrite
Simplify code, handle short writes from callback.
2024-07-16 17:42:10 +02:00
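
Handling short writes, as mentioned in 0ab07b21dd, generally means calling the callback again until everything is written. The loop below is a minimal sketch under that assumption, not the code from the commit. `WriteCallback` has the same shape as libxml2's write callbacks (context, buffer, length), while `writeAll`, `DemoSink` and `demoWrite` are hypothetical names used only for the demonstration.

```c
#include <assert.h>
#include <string.h>

/* Returns the number of bytes written (possibly fewer than len),
 * or a negative value on error. */
typedef int (*WriteCallback)(void *context, const char *buf, int len);

/* Call cb repeatedly until all len bytes are written.
 * Returns len on success and -1 on error or lack of progress. */
static int
writeAll(WriteCallback cb, void *context, const char *buf, int len) {
    int written = 0;

    assert(cb != NULL && len >= 0);

    while (written < len) {
        int ret = cb(context, buf + written, len - written);

        if (ret <= 0)
            return -1; /* error, or no progress (avoids an endless loop) */
        written += ret;
    }
    return written;
}

/* Demo sink that accepts at most 3 bytes per call to force short writes. */
typedef struct {
    char data[64];
    int used;
} DemoSink;

static int
demoWrite(void *context, const char *buf, int len) {
    DemoSink *sink = context;
    int n = len < 3 ? len : 3;

    if (sink->used + n > (int) sizeof(sink->data))
        return -1;
    memcpy(sink->data + sink->used, buf, (size_t) n);
    sink->used += n;
    return n;
}
```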