1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00
Commit Graph

933 Commits

Author SHA1 Message Date
Nick Wellnhofer
6a12be77c6 malloc-fail: Avoid use-after-free after unsuccessful valuePush
In xpath.c there's a lot of code like:

    valuePush(ctxt, xmlCacheNewX());
    ...
    valuePop(ctxt);

If xmlCacheNewX fails, no value will be pushed on the stack. If there's
no error check in between, valuePop will pop an unrelated value which
can lead to use-after-free errors.

Instead of trying to fix all call sites, we simply stop popping values
if an error was signaled. This requires to change the CHECK_TYPE macro
which is often used to determine whether a value can be safely popped.

Found with libFuzzer, see #344.
2023-02-03 12:40:15 +01:00
Nick Wellnhofer
59b3366178 error: Limit number of parser errors
Reporting errors is expensive and some abusive test cases can generate
an error for each invalid input byte. This causes the parser to spend
most of the time with error handling. Limit the number of errors and
warnings to 100.
2022-12-27 14:41:19 +01:00
Nick Wellnhofer
b47ebf047e parser: Deprecate xmlString*DecodeEntities
These are internal functions.
2022-12-21 21:06:03 +01:00
Nick Wellnhofer
ce76ebfd13 entities: Stop counting entities
This was only used in the old version of xmlParserEntityCheck.
2022-12-21 20:19:10 +01:00
Nick Wellnhofer
463bbeeca1 entities: Rework entity amplification checks
This commit implements robust detection of entity amplification attacks,
better known as the "billion laughs" attack.

We now limit the size of the document after substitution of entities to
10 times the size before expansion. This guarantees linear behavior by
definition. There already was a similar check before, but the accounting
of "sizeentities" (size of external entities) and "sizeentcopy" (size of
all copies created by entity references) wasn't accurate.

We also need saturation arithmetic since we're historically limited to
"unsigned long" which is 32-bit on many platforms.

A maximum of 10 MB of substitutions is always allowed. This should make
use cases like DITA work which have caused problems in the past.

The old checks based on the number of entities were removed. This is
accounted for by adding a fixed cost to each entity reference.

Entity amplification checks are now enabled even if XML_PARSE_HUGE is
set. This option is mainly used to allow larger text nodes. Most users
were unaware that it also disabled entity expansion checks.

Some of the limits might be adjusted later. If this change turns out to
affect legitimate use cases, we can add a separate parser option to
disable the checks.

Fixes #294.
Fixes #345.
2022-12-21 20:19:10 +01:00
Nick Wellnhofer
f34f184f8e entities: Add "flags" member to struct xmlEntity
This will hold various flags and eventually replace the "checked"
member.
2022-12-19 15:24:53 +01:00
Nick Wellnhofer
a6debffd7f xmlexports.h: Disable docs for internal macro XMLPUBLIC 2022-12-08 04:22:11 +01:00
Nick Wellnhofer
3b6cc47ab9 xmlexports.h: Remove LIBXML_FASTCALL optimization
This was an experimental and undocumented micro-optimization for
Windows which apparently required different calling conventions for
variable-argument functions, making it impossible to maintain without
domain knowledge.
2022-12-08 04:19:02 +01:00
Nick Wellnhofer
ce9baf94d5 Remove XMLCALL and XMLCDECL macros from public headers 2022-12-08 02:48:27 +01:00
Nick Wellnhofer
c73d464afb threads: Deprecate some internal functions 2022-11-25 15:12:56 +01:00
Chun-wei Fan
707ade225c Visual Studio builds: Allow silencing deprecation warnings
Define XML_IGNORE_DEPRECATION_WARNINGS and the corresponding XML_POP_WARNINGS
for Visual Studio, and consequently define XML_IGNORE_FPTR_CAST_WARNINGS so
that we do not get a compiler warning on Visual Studio by doing a
__pragma(warning(pop)) without a corresponding __pragma(warning(push)).

Also correct the documentation a bit for XML_POP_WARNINGS.
2022-11-23 11:04:38 +08:00
Chun-wei Fan
b9590d5d81 Visual Studio: Define XML_DEPRECATED
We can mark APIs as deprecated using __declspec(deprecated) with Visual Studio
2005 and later, so add a definition of that so that we can help users avoid
using deprecated APIs when using Visual Studio as well.

For the existing GCC definition, check whether we are on GCC 3.1+ before
enabling the definition.
2022-11-23 10:41:08 +08:00
Nick Wellnhofer
68a6518c45 parser: Rewrite push parser boundary checks
Remove inaccurate xmlParseCheckTransition check.

Remove non-incremental xmlParseGetLasts check.

Add functions that check for several boundary constructs more
accurately, keeping track of progress in ctxt->checkIndex.

Fixes #439.
2022-11-20 21:27:08 +01:00
Nick Wellnhofer
2059df5358 buf: Deprecate static/immutable buffers 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
b693905f9b doc: Remove xmlDllMain from documentation and version script
This is a Windows-only symbol.
2022-11-04 14:50:39 +01:00
Nick Wellnhofer
a9669679f5 error: Don't use initGenericErrorDefaultFunc
The code in xmlInitParser did only set the error handler if it was NULL
which should never happen.
2022-09-09 13:52:48 +02:00
Nick Wellnhofer
0d90125859 Fix Windows compiler warnings in python/types.c 2022-09-04 18:36:04 +02:00
Nick Wellnhofer
1e60c76821 Remove HAVE_WIN32_THREADS configuration flag
Check for LIBXML_THREAD_ENABLED and _WIN32 instead.
2022-09-04 01:49:41 +02:00
Nick Wellnhofer
59f2f60e3e Remove "runtime debugging"
This doesn't seem useful as configuration option.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
65dc8a63ac Make xmlNewSAXParserCtx take a const sax handler
Also improve documentation.
2022-09-01 00:17:45 +02:00
Nick Wellnhofer
39dfb048a8 Don't forget to install xmlversion.h
Fixes 7f7961df.
2022-08-26 02:41:15 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
48f84ea8ed Remove internal macros from parserInternals.h
Replace MOVETO_ENDTAG with code that updates line and column numbers.
2022-08-25 21:31:08 +02:00
Nick Wellnhofer
58fc89e8a9 Deprecate internal parser functions 2022-08-25 21:04:57 +02:00
Nick Wellnhofer
a308c0cdf7 Deprecate old HTML SAX API 2022-08-25 21:04:57 +02:00
Nick Wellnhofer
51035c539e Generate deprecation warnings for old SAX API 2022-08-25 20:17:03 +02:00
Nick Wellnhofer
7f7961df25 Remove generated files from distribution
- libxml2.spec
- libxml-2.0.pc
- xml2-config
- include/libxml/xmlversion.h
- python/libxml2.py
- python/libxml2-export.c
- python/libxml2-py.c
- python/libxml2-py.h
- python/libxml2class.py
- python/libxml2class.txt
- python/setup.py
2022-08-25 14:35:31 +02:00
Nick Wellnhofer
34a050cdee Move some HTML functions to correct header file 2022-08-24 16:44:39 +02:00
Nick Wellnhofer
9a82b94a94 Introduce xmlNewSAXParserCtxt and htmlNewSAXParserCtxt
Add API functions to create a parser context with a custom SAX handler
without having to mess with ctxt->sax manually.
2022-08-24 14:07:55 +02:00
Nick Wellnhofer
92bb889be3 Don't index anything in DOC_DISABLE sections
Somewhat misleadingly, the DOC_DISABLE directive only disabled warnings.
Now we really stop the documentation generator from indexing.

This results in additional warnings for xmlThrDef* functions. This should
be fixed by documenting or deprecating them.
2022-08-24 00:51:52 +02:00
Nick Wellnhofer
75b5bc2b0a Deprecate some global variables
Most of these should be completely unused.
2022-08-24 00:31:10 +02:00
Nick Wellnhofer
5264fa1100 Fix warnings from apibuild.py 2022-08-18 21:31:36 +02:00
Nick Wellnhofer
e986d09cf5 Skip incorrectly opened HTML comments
Commit 4fd69f3e fixed handling of '<' characters not followed by an
ASCII letter. But a '<!' sequence followed by invalid characters should
be treated as bogus comment and skipped.

Fixes #380.
2022-08-02 14:38:09 +02:00
Mehltretter Karl
e9270ef0d4 fix Schematron spelling 2022-05-06 10:44:03 +02:00
Nick Wellnhofer
670701075b Add configuration flag for XPointer locations support
Add a new configuration flag that controls whether the outdated support
for XPointer locations (ranges and points) is enabled.

    --with-xptr-locs          # Autotools
    LIBXML2_WITH_XPTR_LOCS    # CMake

The latest spec for what it essentially an XPath extension seems to be
this working draft from 2002:

    https://www.w3.org/TR/xptr-xpointer/

The xpointer() scheme is listed as "being reviewed" in the XPointer
registry since at least 2006. libxml2 seems to be the only modern
software that tries to implement this spec, but the code has many bugs
and quality issues.

The flag defaults to "off" and support for this extensions has to be
requested explicitly. The relevant API functions are deprecated.
2022-04-21 02:41:58 +02:00
Nick Wellnhofer
aab584dc31 Clean up encoding switching code
- Remove xmlSwitchToEncodingInt which was basically just a wrapper
  around xmlSwitchInputEncodingInt.
- Simplify xmlSwitchEncoding.
- Improve error handling in xmlSwitchInputEncodingInt.
- Deprecate xmlSwitchInputEncoding.
2022-04-02 19:09:12 +02:00
Nick Wellnhofer
2070ade6df Undeprecate schema init functions
These functions aren't called from xmlInitParser, so it's better to keep
them public for now.
2022-03-10 01:24:31 +01:00
Nick Wellnhofer
40483d0ce2 Deprecate module init and cleanup functions
These functions shouldn't be part of the public API. Most init
functions are only thread-safe when called from xmlInitParser. Global
variables should only be cleaned up by calling xmlCleanupParser.
2022-03-06 15:59:43 +01:00
Nick Wellnhofer
4a8c71eb7c Remove DOCBparser
This code has been broken and deprecated since version 2.6.0, released
in 2003. Because of a bug in commit 961b535c, DOCBparser.c was never
compiled since 2012. I couldn't find a Debian package using any of its
symbols, so it seems safe to remove this module.
2022-03-04 22:56:21 +01:00
Nick Wellnhofer
ebb1797030 Remove unneeded #includes 2022-03-04 22:11:49 +01:00
Mike Dalessio
d7b287b94c htmlParseComment: handle abruptly-closed comments
See guidance provided on abrutply-closed comments here:

https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-closing-of-empty-comment
2022-03-02 14:42:47 +00:00
Nick Wellnhofer
b66ce0bba8 Don't include ICU headers in public headers
There's no need to make these implementation details public.
2022-03-01 13:02:49 +01:00
Nick Wellnhofer
2489c1d024 Remove useless __CYGWIN__ checks
From what I can tell, some really early Cygwin versions from around
1998-2000 used to erroneously define _WIN32. This was eventually fixed,
but these days, the `defined(_WIN32) && !defined(__CYGWIN__)` idiom is
unnecessary.

Now, we only check for __CYGWIN__ in xmlexports.h when deciding whether
to use __declspec.
2022-02-28 22:58:35 +01:00
Nick Wellnhofer
004fe9de53 Deprecate IDREF-related functions in valid.h
These functions are only needed internally for validation.

xmlGetRefs is inherently unsafe because the ref table isn't updated
if attributes are removed (unlike the ids table).

None of the Ubuntu 20.04 packages depending on libxml2 use any of these
functions (except xmlFreeRefTable in libxslt), so it seems perfectly
safe to deprecate them.

Remove xmlIsRef and xmlRemoveRef from the Python bindings.
2022-02-20 21:49:05 +01:00
Nick Wellnhofer
61de92979b Deprecate all functions in DOCBparser.h 2022-02-20 21:49:05 +01:00
Nick Wellnhofer
cf4893f7b3 Deprecate legacy functions 2022-02-20 21:49:04 +01:00
Nick Wellnhofer
9e0ca5a19f Deprecate all functions in nanoftp.h 2022-02-20 21:49:04 +01:00
Nick Wellnhofer
a2fe74c08a Add XML_DEPRECATED macro
__attribute__((deprecated)) is available since at least GCC 3.1, so an
exact version check is probably unnecessary.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
d7cb33cf44 Rework validation context flags
Use a bitmask instead of magic values to

- keep track whether the validation context is part of a parser context
- keep track whether xmlValidateDtdFinal was called

This allows to add addtional flags later.

Note that this deliberately changes the name of a public struct member,
assuming that this was always private data never to be used by client
code.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
b041d829a2 Remove xmlwin32version.h
This file was undocumented and never used anywhere. Maybe users were
supposed to rename this file to xmlversion.h manually. These days, both
CMake and win32/configure.js generate xmlversion.h from xmlversion.h.in,
just like the Autotools build.
2022-02-16 19:55:30 +01:00