1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00
Commit Graph

959 Commits

Author SHA1 Message Date
Nick Wellnhofer
868b94b80e globals: Reformat libxml/globals.h 2023-09-20 22:06:49 +02:00
Nick Wellnhofer
bbf08608fc globals: Move buffer callback declarations to xmlIO.h 2023-09-20 22:06:49 +02:00
Nick Wellnhofer
dc3382ef97 globals: Move xmlRegisterNodeDefault to tree.c
Code in globals.c must not try to access globals itself since the
accessor macros aren't defined and we would only see the main
variable.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
e7b6ca156f globals: Rework global state destruction on Windows
If DllMain is used, rely on it working as expected. The old code seemed
to attempt to free global state of other threads if, for some reason,
the DllMain mechanism didn't work.

In a static build, register a destructor with
RegisterWaitForSingleObject.

Make public functions xmlGetGlobalState and xmlInitializeGlobalState
no-ops.

Move initialization and registration of global state objects to
xmlInitGlobalState. Lookup global state with xmlGetThreadLocalStorage
which can be inlined nicely.

Also cleanup global state when using TLS. xmlLastError must be reset.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
39a275a541 globals: Define globals using macros
Declare and define globals and helper functions by (ab)using the
preprocessor.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
bf6bd16154 globals: Introduce xmlCheckThreadLocalStorage
Checks whether (emulated) thread-local storage could be allocated.
2023-09-20 22:06:43 +02:00
Nick Wellnhofer
89f4976728 globals: Make xmlGlobalState private
This removes a public struct but it seems impossible to use its members
in a sensible way from external code.
2023-09-19 17:36:29 +02:00
Nick Wellnhofer
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
Nick Wellnhofer
c19771c1f1 globals: Move code from threads.c to globals.c
Move all code that handles globals to the place where it belongs.
2023-09-19 17:34:38 +02:00
Nick Wellnhofer
2a4b811424 globals: Rename members of xmlGlobalState
This is a deliberate first step to remove some internals from the
public API and to avoid issues when redefining tokens.
2023-09-19 17:34:30 +02:00
Nick Wellnhofer
778cca386d legacy: Add stubs for disabled modules
When legacy support is requested, always enable stubs for FTP and
XPointer location modules which were removed from the standard
configuration. Going forward, the --with-legacy configuration option
should be used to provide maximum ABI compatibility.

Fixes #433.
2023-08-20 23:16:12 +02:00
Nick Wellnhofer
ed3bd05284 parser: Allow to set maximum amplification factor 2023-08-20 20:49:16 +02:00
Nick Wellnhofer
f1c1f5c6b4 parser: Revert change to doc->encoding
Fixes #579.
2023-08-17 12:47:14 +02:00
Nick Wellnhofer
ec7be50662 parser: Rework encoding detection
Introduce XML_INPUT_HAS_ENCODING flag for xmlParserInput which is set
when xmlSwitchEncoding is called. The parser can use the flag to
reliably detect whether an encoding was already set via user override,
BOM or other auto-detection. In this case, the encoding declaration
won't be used to switch the encoding.

Before, an inscrutable mix of ctxt->charset, ctxt->input->encoding
and ctxt->input->buf->encoder was used.

Introduce private helper functions to switch encodings used by both the
XML and HTML parser:

- xmlDetectEncoding which skips over the BOM, allowing to remove the
  BOM checks from other encoding functions.
- xmlSetDeclaredEncoding, replacing htmlCheckEncodingDirect, which warns
  about encoding mismatches.

If users override the encoding, store the declared instead of the actual
encoding in xmlDoc. In this case, the actual encoding is known and the
raw value from the doc is more useful.

Also use the input flags to store the ISO-8859-1 fallback state.
Restrict the fallback to cases where no encoding was specified. (The
fallback is only useful in recovery mode and these days broken UTF-8 is
probably more likely than ISO-8859-1, so it might eventually be removed
completely.)

The 'charset' member of xmlParserCtxt is now unused. The 'encoding'
member of xmlParserInput is now unused.

The 'standalone' member of xmlParserInput is renamed to 'flags'.

A new parser state XML_PARSER_XML_DECL is added for the push parser.
2023-08-08 15:19:46 +02:00
Nick Wellnhofer
b8961df65d SAX: Always validate xml:ids
The behavior shouldn't depend on mostly random configuration options.
2023-05-09 03:25:24 +02:00
Nick Wellnhofer
8d5e33ef3e Fix compiler warning on GCC < 8
-Wcast-function-type is only available since GCC 8.
2023-05-03 20:42:10 +02:00
Nick Wellnhofer
3ff6abbf58 encoding: Rework error codes
Use an enum instead of magic numbers. Fix a few error codes. Simplify
handling of "space" and "partial" errors.

See #506.
2023-04-30 16:43:29 +02:00
Nick Wellnhofer
fa993130f9 xpath: Remove remaining references to valueFrame
Fixes #529.
2023-04-30 13:18:17 +02:00
Nick Wellnhofer
3ffcc03b16 parser: Deprecate more internal functions 2023-04-26 20:23:23 +02:00
Nick Wellnhofer
98840d40da parser: Rework EBCDIC code page detection
To detect EBCDIC code pages, we used to switch the encoding twice and
had to be very careful not to decode data after the XML declaration
before the second switch. This relied on a hard-coded expected size of
the XML declaration and was complicated and unreliable.

Now we convert the first 200 bytes to EBCDIC-US and parse the encoding
declaration manually.
2023-03-21 21:35:15 +01:00
Nick Wellnhofer
f8efa589e8 malloc-fail: Handle malloc failures in xmlSchemaInitTypes
Note that this changes the return value of public function
xmlSchemaInitTypes from void to int. This shouldn't break the ABI on
most platforms.

Found when investigating #500.
2023-03-14 15:14:38 +01:00
Nick Wellnhofer
d7daf9fd96 xmllint: Fix use-after-free with --maxmem
Fixes #498.
2023-03-14 14:55:34 +01:00
Nick Wellnhofer
e7c3a4ca1b parser: Deprecate some parser input functions 2023-03-13 19:19:46 +01:00
Nick Wellnhofer
483793940c malloc-fail: Stop using XPath stack frames
There's too much code which assumes that if ctxt->value is non-null,
a value can be successfully popped off the stack. This assumption can
break with stack frames when malloc fails.

Instead of trying to fix all call sites, remove the stack frame logic.
It only offered very little protection against misbehaving extension
functions. We already check the stack size after a function call which
should be enough.

Found by OSS-Fuzz.
2023-03-13 17:11:27 +01:00
Nick Wellnhofer
bd63d730b8 html: Impose some length limits
Impose length limits on names, attribute values, PIs and comments,
similar to the XML parser.
2023-03-12 17:40:55 +01:00
Nick Wellnhofer
b51478dc95 Revert "malloc-fail: Avoid use-after-free after unsuccessful valuePush"
This reverts commit 6a12be77c6.

There's too much code reading ctxt->value directly and making the wrong
assumptions.
2023-02-26 13:23:47 +01:00
Nick Wellnhofer
6a12be77c6 malloc-fail: Avoid use-after-free after unsuccessful valuePush
In xpath.c there's a lot of code like:

    valuePush(ctxt, xmlCacheNewX());
    ...
    valuePop(ctxt);

If xmlCacheNewX fails, no value will be pushed on the stack. If there's
no error check in between, valuePop will pop an unrelated value which
can lead to use-after-free errors.

Instead of trying to fix all call sites, we simply stop popping values
if an error was signaled. This requires to change the CHECK_TYPE macro
which is often used to determine whether a value can be safely popped.

Found with libFuzzer, see #344.
2023-02-03 12:40:15 +01:00
Nick Wellnhofer
59b3366178 error: Limit number of parser errors
Reporting errors is expensive and some abusive test cases can generate
an error for each invalid input byte. This causes the parser to spend
most of the time with error handling. Limit the number of errors and
warnings to 100.
2022-12-27 14:41:19 +01:00
Nick Wellnhofer
b47ebf047e parser: Deprecate xmlString*DecodeEntities
These are internal functions.
2022-12-21 21:06:03 +01:00
Nick Wellnhofer
ce76ebfd13 entities: Stop counting entities
This was only used in the old version of xmlParserEntityCheck.
2022-12-21 20:19:10 +01:00
Nick Wellnhofer
463bbeeca1 entities: Rework entity amplification checks
This commit implements robust detection of entity amplification attacks,
better known as the "billion laughs" attack.

We now limit the size of the document after substitution of entities to
10 times the size before expansion. This guarantees linear behavior by
definition. There already was a similar check before, but the accounting
of "sizeentities" (size of external entities) and "sizeentcopy" (size of
all copies created by entity references) wasn't accurate.

We also need saturation arithmetic since we're historically limited to
"unsigned long" which is 32-bit on many platforms.

A maximum of 10 MB of substitutions is always allowed. This should make
use cases like DITA work which have caused problems in the past.

The old checks based on the number of entities were removed. This is
accounted for by adding a fixed cost to each entity reference.

Entity amplification checks are now enabled even if XML_PARSE_HUGE is
set. This option is mainly used to allow larger text nodes. Most users
were unaware that it also disabled entity expansion checks.

Some of the limits might be adjusted later. If this change turns out to
affect legitimate use cases, we can add a separate parser option to
disable the checks.

Fixes #294.
Fixes #345.
2022-12-21 20:19:10 +01:00
Nick Wellnhofer
f34f184f8e entities: Add "flags" member to struct xmlEntity
This will hold various flags and eventually replace the "checked"
member.
2022-12-19 15:24:53 +01:00
Nick Wellnhofer
a6debffd7f xmlexports.h: Disable docs for internal macro XMLPUBLIC 2022-12-08 04:22:11 +01:00
Nick Wellnhofer
3b6cc47ab9 xmlexports.h: Remove LIBXML_FASTCALL optimization
This was an experimental and undocumented micro-optimization for
Windows which apparently required different calling conventions for
variable-argument functions, making it impossible to maintain without
domain knowledge.
2022-12-08 04:19:02 +01:00
Nick Wellnhofer
ce9baf94d5 Remove XMLCALL and XMLCDECL macros from public headers 2022-12-08 02:48:27 +01:00
Nick Wellnhofer
c73d464afb threads: Deprecate some internal functions 2022-11-25 15:12:56 +01:00
Chun-wei Fan
707ade225c Visual Studio builds: Allow silencing deprecation warnings
Define XML_IGNORE_DEPRECATION_WARNINGS and the corresponding XML_POP_WARNINGS
for Visual Studio, and consequently define XML_IGNORE_FPTR_CAST_WARNINGS so
that we do not get a compiler warning on Visual Studio by doing a
__pragma(warning(pop)) without a corresponding __pragma(warning(push)).

Also correct the documentation a bit for XML_POP_WARNINGS.
2022-11-23 11:04:38 +08:00
Chun-wei Fan
b9590d5d81 Visual Studio: Define XML_DEPRECATED
We can mark APIs as deprecated using __declspec(deprecated) with Visual Studio
2005 and later, so add a definition of that so that we can help users avoid
using deprecated APIs when using Visual Studio as well.

For the existing GCC definition, check whether we are on GCC 3.1+ before
enabling the definition.
2022-11-23 10:41:08 +08:00
Nick Wellnhofer
68a6518c45 parser: Rewrite push parser boundary checks
Remove inaccurate xmlParseCheckTransition check.

Remove non-incremental xmlParseGetLasts check.

Add functions that check for several boundary constructs more
accurately, keeping track of progress in ctxt->checkIndex.

Fixes #439.
2022-11-20 21:27:08 +01:00
Nick Wellnhofer
2059df5358 buf: Deprecate static/immutable buffers 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
b693905f9b doc: Remove xmlDllMain from documentation and version script
This is a Windows-only symbol.
2022-11-04 14:50:39 +01:00
Nick Wellnhofer
a9669679f5 error: Don't use initGenericErrorDefaultFunc
The code in xmlInitParser did only set the error handler if it was NULL
which should never happen.
2022-09-09 13:52:48 +02:00
Nick Wellnhofer
0d90125859 Fix Windows compiler warnings in python/types.c 2022-09-04 18:36:04 +02:00
Nick Wellnhofer
1e60c76821 Remove HAVE_WIN32_THREADS configuration flag
Check for LIBXML_THREAD_ENABLED and _WIN32 instead.
2022-09-04 01:49:41 +02:00
Nick Wellnhofer
59f2f60e3e Remove "runtime debugging"
This doesn't seem useful as configuration option.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
65dc8a63ac Make xmlNewSAXParserCtx take a const sax handler
Also improve documentation.
2022-09-01 00:17:45 +02:00
Nick Wellnhofer
39dfb048a8 Don't forget to install xmlversion.h
Fixes 7f7961df.
2022-08-26 02:41:15 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
48f84ea8ed Remove internal macros from parserInternals.h
Replace MOVETO_ENDTAG with code that updates line and column numbers.
2022-08-25 21:31:08 +02:00
Nick Wellnhofer
58fc89e8a9 Deprecate internal parser functions 2022-08-25 21:04:57 +02:00