Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).
Fix formatting in a few corner cases that automatic conversion can't
handle.
Rearrange some DOC_DISABLE blocks.
Swap arguments in initial call to xmlFARecurseDeterminism.
Fix the check whether we revisit the initial state in
xmlFARecurseDeterminism.
If there are transitions with equal atoms and targets but different
counters, treat the regex as deterministic but mark the transitions as
non-deterministic internally.
Don't overwrite zero return value of xmlFAComputesDeterminism
with non-zero value from xmlFARecurseDeterminism.
Most of these errors lead to non-deterministic regexes not being
detected which typically isn't an issue. The improved code may break
users who relied on buggy behavior or cause other bugs to become
visible.
Fixes#469.
The visited flag must only be reset after the first call to
xmlFAReduceEpsilonTransitions has finished. Visiting states multiple
times could lead to unnecessary processing of duplicate transitions.
Similar to 68eadabd.
Private functions were previously declared
- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.
Consolidate all private header files in include/private.
This adds support for some non-standard escape sequences observed
in Microsoft's MSXML DLLs and used by Windows apps, and thus
needed by Wine. Some are also used in other XML implementations,
eg. Java's.
This isn't intended to be final. We probably wish to toggle these
non-standard escape sequences on and off somehow, as needed by
the caller.
Further discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/260
When building the internal representation of a regexp, it is possible
that a lot of empty transitions are created. Therefore there is a step
to reduce them in the function xmlFAEliminateSimpleEpsilonTransitions.
There is an error there for this case:
* State 1 has a transition with an atom (in this case "a") to state 2.
* State 2 is final and has an epsilon transition to state 1.
After reduction it looked like:
* State 1 has a transition with an atom (in this case "a") to itself
and is final.
In other words, the empty string is accepted when it shouldn't be.
The attached patch skips the reduction step for final states.
An alternative would be to insert or increment counters when reducing a
final state, but this seemed error prone and unnecessary, since there
aren't that many final states.
Fixes#282