mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-21 14:53:44 +03:00
173 Commits

Author SHA1 Message Date
Nick Wellnhofer
9bbffec568 doc: Move brief to top, params to bottom of doc comments 2025-05-06 19:51:38 +02:00
Nick Wellnhofer
e78e05c990 doc: Fix autolinks to functions
Unfortunately, autolinks in .c files aren't converted by Doxygen for
some reason.
2025-05-02 17:45:31 +02:00
Nick Wellnhofer
f7c412874b doc: Remove more comment block headers 2025-05-02 17:41:26 +02:00
Nick Wellnhofer
e525564f65 doc: Remove empty lines at start of block
These lines were left over after automatic conversion.
2025-05-02 11:42:05 +02:00
Nick Wellnhofer
e549622bc5 doc: Convert documentation to Doxygen
Automated conversion based on a few regexes.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
69879da88f doc: Remove email addresses from documentation
Also remove authorship information from generated files, hash.c and
globals.c which were rewritten.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
61890e399d doc: Prepare for conversion to Doxygen
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).

Fix formatting in a few corner cases that automatic conversion can't
handle.

Rearrange some DOC_DISABLE blocks.
2025-05-01 23:23:42 +02:00
Nick Wellnhofer
03a8d5f93d unicode: Make Unicode functions private 2025-03-04 17:31:11 +01:00
Nick Wellnhofer
6fc260760a regexp: Hide debugging code behind DEBUG_REGEXP
xmlRegexpPrint is now a deprecated no-op.
2025-02-22 20:55:06 +01:00
Florin Haja
4649f28f77 xmlregexp: add support for compact form of automata in xmlRegexpPrint 2025-02-22 19:29:07 +00:00
Nick Wellnhofer
c82270a9a7 regexp: Avoid dangling start/stop pointers in atom
States could be eliminated later, so set start/stop pointers to NULL
after they're used in xmlFAGenerateTransitions.
2025-02-22 18:55:43 +01:00
Nick Wellnhofer
9c16a153d8 Revert "include: Make most IS_* macros private"
This reverts commit 84a6c82ff8.
2025-02-13 20:20:17 +01:00
Nick Wellnhofer
84a6c82ff8 include: Make most IS_* macros private
Macros like IS_DIGIT or IS_LETTER severely pollute the C namespace.
2024-12-21 20:01:30 +01:00
Nick Wellnhofer
0d6136da21 regexp: Check reallocations for overflow 2024-12-21 19:37:38 +01:00
Nick Wellnhofer
5d36664fc9 memory: Deprecate xmlGcMemSetup 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
2dcd561dc8 regexp: Don't print to stderr 2024-07-15 16:33:38 +02:00
Nick Wellnhofer
6be79014d7 Remove unused code 2024-07-15 16:33:38 +02:00
Nick Wellnhofer
598ee0d2c6 error: Remove underscores from xmlRaiseError 2024-06-27 14:43:10 +02:00
Rosen Penev
217e9b7af2 clang-tidy: don't return in void functions
Found with readability-redundant-control-flow

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2024-06-20 20:37:34 +00:00
Nick Wellnhofer
fa01278dcd regexp: Hide experimental legacy code
This was never made public.
2024-06-16 18:47:12 +02:00
Nick Wellnhofer
10d60d15d6 regexp: Stop using LIBXML_AUTOMATA_ENABLED
This macro always equals LIBXML_REGEXP_ENABLED.
2024-06-16 18:47:12 +02:00
Nick Wellnhofer
0651ad667c valid: Report malloc failure after xmlRegExecPushString 2024-05-13 13:08:14 +02:00
Nick Wellnhofer
05d9bacd05 regexp: Improve error handling
Handle malloc failure from xmlRaiseError.

Use xmlRaiseMemoryError.

Remove argument from memory error handler.

Remove TODO macro.
2023-12-21 15:02:24 +01:00
Nick Wellnhofer
1a354d5b30 regexp: Report malloc failures
Fix places where malloc failures aren't reported.
2023-12-11 22:13:05 +01:00
Nick Wellnhofer
3e7673bc2d malloc-fail: Report malloc failure in xmlFARegExec 2023-09-29 00:15:40 +02:00
Nick Wellnhofer
b7d56ef7f1 malloc-fail: Report malloc failure in xmlRegEpxFromParse
Also check whether malloc failures are reported when fuzzing.
2023-09-22 19:53:11 +02:00
Nick Wellnhofer
f98fa86318 regexp: Fix status codes and handle invalid UTF-8
Fixes #561.
2023-09-22 19:01:11 +02:00
Nick Wellnhofer
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
Nick Wellnhofer
a800b7e058 regexp: Fix null deref in xmlFAFinishReduceEpsilonTransitions
Short-lived regression found by OSS-Fuzz.
2023-05-04 12:47:00 +02:00
Nick Wellnhofer
c613ab14b8 regexp: Fix mistake in previous commit
The `ret = 0` line should have been deleted.

Fixes #531.
2023-05-02 00:32:50 +02:00
Nick Wellnhofer
a06eaa6119 regexp: Fix determinism checks
Swap arguments in initial call to xmlFARecurseDeterminism.

Fix the check whether we revisit the initial state in
xmlFARecurseDeterminism.

If there are transitions with equal atoms and targets but different
counters, treat the regex as deterministic but mark the transitions as
non-deterministic internally.

Don't overwrite zero return value of xmlFAComputesDeterminism
with non-zero value from xmlFARecurseDeterminism.

Most of these errors lead to non-deterministic regexes not being
detected which typically isn't an issue. The improved code may break
users who relied on buggy behavior or cause other bugs to become
visible.

Fixes #469.
2023-04-30 22:37:11 +02:00
Nick Wellnhofer
e301865e69 regexp: Fix checks for eliminated transitions
'to' can be set to -1 or -2 when eliminating transitions, so check for
all negative values.
2023-04-30 22:36:51 +02:00
Nick Wellnhofer
90759c598d regexp: Simplify xmlFAReduceEpsilonTransitions 2023-04-30 22:36:41 +02:00
Nick Wellnhofer
9f7b114232 regexp: Fix cycle check in xmlFAReduceEpsilonTransitions
The visited flag must only be reset after the first call to
xmlFAReduceEpsilonTransitions has finished. Visiting states multiple
times could lead to unnecessary processing of duplicate transitions.

Similar to 68eadabd.
2023-04-30 22:36:33 +02:00
Nick Wellnhofer
85057e5131 regexp: Add sanity check in xmlRegCalloc2
These arguments should be non-zero, but add a sanity check to avoid
division by zero.

Fixes #450.
2023-02-21 15:43:32 +01:00
Nick Wellnhofer
1743c4c3fc malloc-fail: Fix OOB read after xmlRegGetCounter
Found with libFuzzer, see #344.
2023-02-17 17:18:59 +01:00
Nick Wellnhofer
40bc1c699a malloc-fail: Fix memory leak in xmlFAParseCharProp
Found with libFuzzer, see #344.
2023-02-17 17:18:55 +01:00
Nick Wellnhofer
e64653c0e7 malloc-fail: Fix leak of xmlRegAtom
Found with libFuzzer, see #344.
2023-02-17 17:18:55 +01:00
Nick Wellnhofer
ed615967df malloc-fail: Fix memory leak in xmlRegexpCompile
Found with libFuzzer, see #344.
2023-02-17 17:18:55 +01:00
Nick Wellnhofer
e60c9f4c4b malloc-fail: Fix memory leak after xmlRegNewState
Invoke xmlRegNewState from xmlRegStatePush to simplify error handling.

Found with libFuzzer, see #344.
2023-02-17 17:16:51 +01:00
Nick Wellnhofer
bd33331bb9 regexp: Simplify xmlRegAtomPush 2023-02-17 17:16:50 +01:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
145170125a Fix parsing of subtracted regex character classes
Fixes #370.
2022-04-23 19:22:42 +02:00
Nick Wellnhofer
ebb1797030 Remove unneeded #includes 2022-03-04 22:11:49 +01:00
Damjan Jovanovic
37ebf8a8b2 Document support for the non-standard escape sequences.
Support non-BMP code points in surrogate pairs of '\uXXXX\uXXXX'.
2022-03-02 15:25:21 +00:00
Damjan Jovanovic
b66c19612c Use strtoul() instead of sscanf, and correct data types that break GCC. 2022-03-02 15:25:21 +00:00
Damjan Jovanovic
ec8ff95ce3 Add support for some non-standard escapes in regular expressions.
This adds support for some non-standard escape sequences observed
in Microsoft's MSXML DLLs and used by Windows apps, and thus
needed by Wine. Some are also used in other XML implementations,
eg. Java's.

This isn't intended to be final. We probably wish to toggle these
non-standard escape sequences on and off somehow, as needed by
the caller.

Further discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/260
2022-03-02 15:25:21 +00:00
Nick Wellnhofer
776d15d383 Don't check for standard C89 headers
Don't check for

- ctype.h
- errno.h
- float.h
- limits.h
- math.h
- signal.h
- stdarg.h
- stdlib.h
- string.h
- time.h

Stop including non-standard headers

- malloc.h
- strings.h
2022-03-02 00:43:54 +01:00
Nick Wellnhofer
ea6e8f998d Fix certain combinations of regex range quantifiers
Fix regex transitions that have both min/max and a counter. In this
case, we want to save the regex state before incrementing the counter.

Fixes #301 and the issue reported here:

https://mail.gnome.org/archives/xml/2016-April/msg00017.html
2022-02-28 16:56:02 +01:00
Nick Wellnhofer
382fb056b5 Fix range quantifier on subregex
Make sure to add counted exit transitions before other counter
transitions. Otherwise, we won't backtrack correctly.

Fixes #65.
2022-02-28 16:56:02 +01:00
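
Note on 85057e5131 (regexp: Add sanity check in xmlRegCalloc2): the message says the arguments should be non-zero and that the check avoids a division by zero. The log does not show the code, so the following is only a minimal sketch of why such a guard is needed, assuming the overflow test is done by dividing SIZE_MAX by the dimensions. The name calloc2_checked and its signature are hypothetical and are not libxml2's implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not libxml2's xmlRegCalloc2): allocate a zeroed
 * dim1 x dim2 table of elements of elem_size bytes each.
 * Returns NULL on zero-sized arguments, on overflow, or on malloc failure.
 */
static void *
calloc2_checked(size_t dim1, size_t dim2, size_t elem_size) {
    size_t total;
    void *ret;

    /* Sanity check first: the overflow tests below divide by these values. */
    if (dim1 == 0 || dim2 == 0 || elem_size == 0)
        return NULL;

    /* dim1 * dim2 would overflow size_t. */
    if (dim1 > SIZE_MAX / dim2)
        return NULL;
    total = dim1 * dim2;

    /* total * elem_size would overflow size_t. */
    if (total > SIZE_MAX / elem_size)
        return NULL;
    total *= elem_size;

    ret = malloc(total);
    if (ret != NULL)
        memset(ret, 0, total);
    return ret;
}
```

Without the first test, a zero dimension would reach the SIZE_MAX division as the divisor. A caller would use it like calloc and release the result with free().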