mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00
1026 Commits

Author SHA1 Message Date
Nick Wellnhofer
0821efc8ee encoding: Check whether encoding handlers support input/output
The "HTML" encoding handler doesn't support input which could lead to a
wrong error report.
2024-01-02 19:48:23 +01:00
Nick Wellnhofer
e8fb3d639f parser: Convert some "internal errors" to meaningful codes 2024-01-02 19:48:23 +01:00
Nick Wellnhofer
a2cc7f5f04 parser: Set depth limit to 2048 with XML_PARSE_HUGE
Deeply nested documents can cause performance problems, so the nesting
depth should always be limited to a reasonable value.

Also remove the global xmlParserMaxDepth setting which isn't thread-safe
and seems unused.
2024-01-02 19:42:06 +01:00
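The depth limit described in a2cc7f5f04 can be pictured with a small standalone sketch. This is not libxml2 code: check_nesting is a hypothetical helper, parentheses stand in for element start and end tags, and the limit is passed in as a parameter because only the 2048 value for XML_PARSE_HUGE is stated above.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of libxml2: '(' opens a nested level and
 * ')' closes one. Returns 0 if the input is balanced and never nests
 * deeper than max_depth, -1 otherwise.
 */
int check_nesting(const char *s, int max_depth) {
    int depth = 0;

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == '(') {
            /* Refuse to go deeper instead of recursing without bound. */
            if (depth >= max_depth)
                return -1;
            depth++;
        } else if (s[i] == ')') {
            if (depth == 0)
                return -1; /* close without matching open */
            depth--;
        }
    }

    return depth == 0 ? 0 : -1;
}
```

Passing the limit as an argument rather than reading a global mirrors the thread-safety concern in the commit message: no shared mutable setting is involved.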
Nick Wellnhofer
875bb08489 parser: Implement xmlCtxtSetOptions
Surprisingly, some options can only be enabled with xmlCtxtUseOptions
and it's impossible to unset them. Add a new API function
xmlCtxtSetOptions which sets or clears all options.

Finally document all parser options.

Make sure to synchronize option bits and struct members.
2024-01-02 19:42:06 +01:00
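The difference between "use" and "set" semantics in 875bb08489 can be modeled with plain bitmasks. The sketch below is a simplified illustration under stated assumptions, not the libxml2 implementation: the DEMO_OPT_* bits and both functions are hypothetical, and the real functions operate on a parser context rather than on a bare integer.

```c
/* Hypothetical option bits, for illustration only. */
enum {
    DEMO_OPT_RECOVER = 1 << 0,
    DEMO_OPT_NOENT   = 1 << 1,
    DEMO_OPT_HUGE    = 1 << 2
};

/* "Use" semantics: requested bits are added, existing bits are never cleared. */
int demo_use_options(int current, int options) {
    return current | options;
}

/*
 * "Set" semantics: the requested mask replaces the current one, so every
 * option is either set or cleared according to `options`.
 */
int demo_set_options(int current, int options) {
    (void) current; /* previous state is intentionally ignored */
    return options;
}
```

With the first function, starting from DEMO_OPT_NOENT and passing DEMO_OPT_RECOVER leaves both bits enabled; with the second, only DEMO_OPT_RECOVER remains.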
Nick Wellnhofer
2b79f106ff parser: Simplify entity size accounting 2024-01-02 14:17:27 +01:00
Nick Wellnhofer
7e0bbbc143 parser: New input API
Provide a new set of functions to create xmlParserInputs. These can be
used for the document entity or from external entity loaders.

- Don't require xmlParserInputBuffer.
- All functions take a base URI.
- All functions take an encoding as string.
- xmlNewInputURL also takes a public ID.
- xmlNewInputMemory takes a size_t.
- Optimization hints for memory buffers.

Improve documentation.

Only call xmlInitParser before allocating a new parser context.

Call xmlCtxtUseOptions as early as possible.
2023-12-29 01:22:13 +01:00
Nick Wellnhofer
d025cfbb4b parser: Always copy content from entity to target.
Make sure that references from IDs are updated.

Note that if there are IDs with the same value in a document, the last
one will now be returned. IDs should be unique, but maybe this should be
addressed.
2023-12-29 01:22:11 +01:00
Nick Wellnhofer
a5dcf0f422 parser: Mark more parser context members as unused 2023-12-29 01:20:08 +01:00
Nick Wellnhofer
6a9a88a17f parser: Move progressive flag into input struct 2023-12-29 01:20:08 +01:00
Nick Wellnhofer
d944a41515 parser: Fix in-parameter-entity and in-external-dtd checks
Use ctxt->input->entity instead of ctxt->inputNr to determine whether
we are inside a parameter entity.

Stop using ctxt->external to check whether we're in an external DTD.
This is signaled by ctxt->inSubset == 2.
2023-12-29 01:19:56 +01:00
Nick Wellnhofer
c1bddd4c26 parser: Mark 'length' member of xmlParserInput as unused 2023-12-25 23:38:40 +01:00
Nick Wellnhofer
955c177f69 parser: Stop using 'directory' struct member
This was only used as a pointless fallback for URI resolution.
2023-12-25 23:38:40 +01:00
Nick Wellnhofer
c73de050f5 include: Move non-generated parts from xmlversion.h.in
xmlexports.h originally only included symbol visibility macros but it's
a good place for other macros as well.
2023-12-25 23:38:40 +01:00
Nick Wellnhofer
229e5ff7f9 io: Remove support for HTTP POST
This feature is unlikely to be used these days.
2023-12-24 22:11:49 +01:00
Nick Wellnhofer
23345a1cb1 io: Report IO errors through xmlCtxtErrIO
This is also a new public API function to be used in external entity
loaders.
2023-12-21 15:02:24 +01:00
Nick Wellnhofer
1ef3566362 io: Always use unbuffered input
Before, we often used unbuffered input via the lzma or gzip handlers,
more or less inadvertently.

Change the default file handlers from buffered (stdc FILE) to unbuffered
(POSIX fds).
2023-12-21 15:02:24 +01:00
Nick Wellnhofer
b2dbcc432b io: Rework default callbacks
Register a dummy callback struct for default callbacks. Handle them in a
separate function which will later allow returning meaningful error
codes.
2023-12-21 15:02:24 +01:00
Nick Wellnhofer
2829a21a95 xinclude: Improve error handling
Introduce xmlXIncludeSetErrorHandler, which allows setting a structured
error handler for an XInclude context and forwards errors from the parser.

Remove arguments from memory error handlers.

Use xmlRaiseMemoryError.
2023-12-21 02:46:27 +01:00
Nick Wellnhofer
954b898494 xpath: Improve error handling
Introduce xmlXPathSetErrorHandler, which allows setting a structured
error handler for an XPath context.

Remove arguments from memory error handlers.

Use xmlRaiseMemoryError.

Remove TODO, STRANGE and CHECK_CTXT macros.

Remove remaining uses of xmlGenericError.
2023-12-21 02:46:27 +01:00
Nick Wellnhofer
54c70ed57f parser: Improve error handling
Introduce xmlCtxtSetErrorHandler, which allows setting a structured error
handler for a parser context. There already was the "serror" SAX handler
but this always receives the parser context as argument.

Start to use xmlRaiseMemoryError.

Remove useless arguments from memory error functions. Rename
xmlErrMemory to xmlCtxtErrMemory.

Remove a few calls to xmlGenericError.

Remove support for runtime entity debugging.
2023-12-21 02:46:27 +01:00
Nick Wellnhofer
5d2dbe79fa parser: Fix build --without-output
Fixes #647
2023-12-14 13:48:41 +01:00
Nick Wellnhofer
c2bbeed1fd io: Fix memory lifetime issue with input buffers
xmlParserInputBufferCreateMem must make a copy of the buffer.

This fixes a regression from 2.11 which could cause reads from freed
memory depending on the use case.

Undeprecate xmlParserInputBufferCreateStatic which can avoid copying
the whole buffer.
2023-12-12 23:51:32 +01:00
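The lifetime trade-off behind c2bbeed1fd, copying the caller's buffer versus borrowing it, can be sketched in standalone C. The struct and functions below are hypothetical stand-ins and do not reflect libxml2's internal layout; they only show why a copying constructor is safe after the caller releases its memory while a static (borrowing) one is not.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical input buffer, not libxml2's xmlParserInputBuffer. */
struct demo_input {
    const char *data;  /* bytes to read from */
    size_t size;
    char *owned;       /* non-NULL only if this struct owns a copy */
};

/* Copying variant: the caller may free or modify `mem` afterwards. */
int demo_input_copy(struct demo_input *in, const char *mem, size_t size) {
    char *copy = malloc(size ? size : 1);

    if (copy == NULL)
        return -1; /* report the allocation failure to the caller */
    memcpy(copy, mem, size);
    in->data = copy;
    in->size = size;
    in->owned = copy;
    return 0;
}

/* Borrowing variant: no copy is made, so `mem` must outlive the input. */
void demo_input_static(struct demo_input *in, const char *mem, size_t size) {
    in->data = mem;
    in->size = size;
    in->owned = NULL;
}

/* Frees the copy if there is one; a borrowed buffer is left untouched. */
void demo_input_free(struct demo_input *in) {
    free(in->owned);
    in->data = NULL;
    in->owned = NULL;
    in->size = 0;
}
```

The borrowing variant avoids duplicating large documents, at the cost of making the caller responsible for keeping the memory alive.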
Nick Wellnhofer
157df34401 xmlreader: Report malloc failures
Fix many places where malloc failures aren't reported.

Introduce a new API function xmlTextReaderGetLastError.
2023-12-11 22:13:06 +01:00
Nick Wellnhofer
78eab7a130 xinclude: Report malloc failures
Fix many places where malloc failures aren't reported.

Introduce a new API function xmlXIncludeGetLastError.
2023-12-11 22:13:05 +01:00
Nick Wellnhofer
f19a95108a parser: Report malloc failures
Fix many places where malloc failures aren't reported.

Make xmlErrMemory public. This is useful for custom external entity
loaders.

Introduce new API function xmlSwitchEncodingName.

Change the way we store whether the parser is stopped. This used
to be signaled by setting ctxt->instate to XML_PARSER_EOF which was
misdesigned and error-prone. Set ctxt->disableSAX to 2 instead and
introduce a macro PARSER_STOPPED. Also stop removing parser inputs in
xmlHaltParser. This allows removing many checks of ctxt->instate.

Introduce xmlErrParser to handle errors if a parser context is
available.
2023-12-11 22:13:05 +01:00
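The stop-flag change in f19a95108a can be illustrated as follows. This is a hedged sketch, not the library's code: struct demo_ctxt, DEMO_PARSER_STOPPED and both functions are hypothetical, and the exact comparison used by the real PARSER_STOPPED macro is an assumption here. The point is that the stopped condition lives in its own field, so the parser state is not overwritten when parsing halts.

```c
/* Hypothetical, heavily reduced parser context. */
struct demo_ctxt {
    int disableSAX; /* 0: normal, 1: SAX disabled, 2: parser stopped */
    int instate;    /* current parser state, no longer abused as a stop flag */
};

/* Assumed shape of the check; the real macro may differ. */
#define DEMO_PARSER_STOPPED(ctxt) ((ctxt)->disableSAX > 1)

/* Halting only raises the flag and leaves the state untouched. */
void demo_halt(struct demo_ctxt *ctxt) {
    ctxt->disableSAX = 2;
}

/* Returns 0 after doing one unit of work, -1 if the parser is stopped. */
int demo_parse_step(struct demo_ctxt *ctxt) {
    if (DEMO_PARSER_STOPPED(ctxt))
        return -1;
    ctxt->instate++; /* stand-in for a real state transition */
    return 0;
}
```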
Nick Wellnhofer
0d97e43993 save: Report malloc failures
Fix places where malloc failures aren't reported.

Introduce a new API function xmlSaveFinish which returns an error code.
2023-12-11 22:13:05 +01:00
Nick Wellnhofer
aca16fb3d4 tree: Report malloc failures
Fix many places where malloc failures aren't reported.

Make some API functions return an error code. Changing the return type
from void to int is technically an ABI break but should be safe on most
platforms.

- xmlNodeSetContent
- xmlNodeSetContentLen
- xmlNodeAddContent
- xmlNodeAddContentLen
- xmlNodeSetBase

Introduce new API functions that return a separate error code if a
memory allocation fails.

- xmlNodeGetAttrValue
- xmlNodeGetBaseSafe
- xmlGetNsListSafe

Introduce private functions xmlTreeEnsureXMLDecl and xmlSplitQName4.

Don't report low-level errors to the global error handler.

Fix tree

Introduce xmlGetNsListSafe

Fix tree
2023-12-11 22:13:05 +01:00
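The "Safe" pattern that recurs in these commits, returning a status code and delivering the result through an output parameter, can be sketched in plain C. Everything below is hypothetical and only loosely modeled on the description above: demo_strdup_safe, the alloc_fn type and the specific return values are assumptions, and the injectable allocator exists purely so that a failure can be simulated.

```c
#include <stdlib.h>
#include <string.h>

/* Allocator signature so a failing allocator can be injected in tests. */
typedef void *(*alloc_fn)(size_t);

/*
 * Hypothetical helper. Returns 0 on success, 1 if there is nothing to
 * copy, and -1 if the allocation failed. *out is NULL unless 0 is returned.
 */
int demo_strdup_safe(const char *src, alloc_fn alloc, char **out) {
    size_t len;
    char *copy;

    *out = NULL;
    if (src == NULL)
        return 1;

    len = strlen(src) + 1;
    copy = alloc(len);
    if (copy == NULL)
        return -1; /* distinguishable from the "nothing to copy" case */

    memcpy(copy, src, len);
    *out = copy;
    return 0;
}

/* Allocator that always fails, used to exercise the error path. */
void *demo_failing_alloc(size_t size) {
    (void) size;
    return NULL;
}
```

A function that returns only a pointer cannot tell "no value" apart from "out of memory"; the separate status code removes that ambiguity.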
Nick Wellnhofer
e34a49b78e valid: Improve addition and deletion of IDs
Introduce a new API function xmlAddIDSafe that returns a separate error
code if a memory allocation fails.

Store a pointer to the ID struct in xmlAttr so attributes can be
freed without allocating memory. It's impossible to report malloc
failures in deallocation code.
2023-12-11 22:13:05 +01:00
Nick Wellnhofer
e1560990ec pattern: Report malloc failures
Fix places where malloc failures aren't reported.

Introduce a new API function xmlPatternCompileSafe that returns a
separate error code if a memory allocation fails.
2023-12-11 22:05:47 +01:00
Nick Wellnhofer
a1f7ecaef8 entities: Report malloc failures
Fix places where malloc failures aren't reported.

Introduce new API function xmlAddEntity that returns separate error
codes.

Don't invoke global error handler for low-level errors which should be
handled by higher layers.

Invalid redeclaration warnings will be fixed later.
2023-12-11 22:05:47 +01:00
Nick Wellnhofer
f313848bd8 hash: Report malloc failures
Introduce new API functions that return a separate error code if a
memory allocation fails.

- xmlHashAdd
- xmlHashCopySafe
2023-12-11 22:05:47 +01:00
Nick Wellnhofer
bd5ad0308d encoding: Report malloc failures
Introduce new API functions that return a separate error code if a
memory allocation fails.

- xmlOpenCharEncodingHandler
- xmlLookupCharEncodingHandler

Fix a few places where malloc failures weren't reported.
2023-12-11 22:05:47 +01:00
Nick Wellnhofer
da996c8d0f uri: Report malloc failures
Fix many places where malloc failures weren't reported, for example
after calling xmlStrdup.

Introduce new public API functions that return a separate error code if
a memory allocation fails:

- xmlParseURISafe
- xmlBuildURISafe
- xmlBuildRelativeURISafe

Update the fuzzer to check whether malloc failures are reported.
2023-12-11 22:05:47 +01:00
Nick Wellnhofer
df0b540b3e include: Rename XML_EMPTY helper macro
Avoid name clash with downstream projects.
2023-12-07 14:59:47 +01:00
Nick Wellnhofer
a9738e311c include: Move declaration of xmlInitGlobals
Fix downstream build issues after reworking globals.h.
2023-12-07 14:59:40 +01:00
Nick Wellnhofer
52703ffdf1 include: Add missing includes 2023-12-07 12:31:16 +01:00
Nick Wellnhofer
9122ad0ce6 include: Move globals from xmlsave.h to parser.h
Fix downstream build issues after reworking globals.h.
2023-12-07 12:31:06 +01:00
Nick Wellnhofer
c011e7605d globals: Remove unused globals from thread storage
Setting these deprecated globals hasn't had an effect for a long time.
Make them constants. This reduces the size of per-thread storage from
~700 to ~250 bytes.
2023-12-06 20:07:54 +01:00
Nick Wellnhofer
1c7f4c70fe nanohttp: Deprecate public API
The long-term plan is to remove the built-in HTTP client. There are
still a few downstream projects that use libxml2's client for other
purposes. These users should get deprecation warnings now.
2023-11-27 13:43:06 +01:00
Nick Wellnhofer
7d6969d955 Remove Trio
Trio is a rather old cross-platform printf library which was bundled with
libxml2. It was needed for ancient pre-C99 systems without snprintf and
should be safe to remove these days.
2023-11-23 15:48:52 +01:00
Nick Wellnhofer
ff6c318862 include: Remove useless 'const' from function arguments 2023-11-23 15:27:00 +01:00
makise-homura
6bc86405d1 Avoid EDG deprecation warnings for LCC compiler 2023-11-22 05:34:56 +00:00
Nick Wellnhofer
aca37d8c77 parser: Only enable SAX2 if there are SAX2 element handlers
This reverts part of commit 235b15a5 for backward compatibility and
adds some comments trying to clarify the whole mess.

Fixes #623.
2023-11-20 15:20:37 +01:00
Nick Wellnhofer
61034116d0 error: Make more xmlError structs constant
Prepare for future changes, see 45470611.
2023-10-24 15:02:36 +02:00
Nick Wellnhofer
253f260bb1 threads: Fix --with-thread-alloc
Fixes #606.
2023-10-18 20:07:04 +02:00
Nick Wellnhofer
713ded60ad entities: Make xmlFreeEntity public 2023-10-06 10:47:07 +02:00
Nick Wellnhofer
e0dd330b8f parser: Use hash tables to avoid quadratic behavior
Use a hash table to lookup namespaces by prefix. The hash table stores
an index into the namespace table. Auxiliary data for namespaces is
stored in a separate array alongside the main namespace table.

Use a hash table to verify attribute uniqueness. The hash table stores
an index into the attribute table.

Reuse hash values from the dictionary to avoid computing them twice.

See #346.
2023-09-29 12:43:22 +02:00
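The attribute uniqueness check from e0dd330b8f can be approximated with a small open-addressing table whose slots hold indices into the attribute array. This is a minimal sketch under assumptions, not the parser's implementation: the fixed table size, the FNV-1a hash and all names are hypothetical, and the real code reuses hash values from the dictionary instead of hashing again.

```c
#include <string.h>

#define DEMO_TABLE_SIZE 64 /* power of two; must exceed the attribute count */

/* FNV-1a string hash, chosen here only for brevity. */
unsigned demo_hash_name(const char *s) {
    unsigned h = 2166136261u;

    while (*s != '\0') {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h;
}

/*
 * Returns the index of the first attribute whose name repeats an earlier
 * one, -1 if all names are unique, or -2 if the fixed table is too small.
 */
int demo_find_duplicate_attr(const char **names, int count) {
    int table[DEMO_TABLE_SIZE];
    int i;

    if (count >= DEMO_TABLE_SIZE)
        return -2; /* this sketch does not grow the table */

    for (i = 0; i < DEMO_TABLE_SIZE; i++)
        table[i] = -1; /* -1 marks an empty slot */

    for (i = 0; i < count; i++) {
        unsigned slot = demo_hash_name(names[i]) & (DEMO_TABLE_SIZE - 1);

        /* Linear probing: only colliding entries are compared by name. */
        while (table[slot] != -1) {
            if (strcmp(names[table[slot]], names[i]) == 0)
                return i;
            slot = (slot + 1) & (DEMO_TABLE_SIZE - 1);
        }
        table[slot] = i; /* store the index, not the attribute itself */
    }

    return -1;
}
```

Each attribute is compared only against entries in its probe sequence, so the expected cost is linear in the number of attributes rather than quadratic as with pairwise comparison.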
Nick Wellnhofer
b31813e60c include: Add more missing stdio.h includes 2023-09-28 15:34:08 +02:00
Nick Wellnhofer
84e1ffc813 doc: Don't document internal macros in xmlversion.h 2023-09-22 19:01:11 +02:00
Nick Wellnhofer
b94283fbda regexp: Add missing include 2023-09-22 14:23:27 +02:00