1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00
Commit Graph

1275 Commits

Author SHA1 Message Date
Nick Wellnhofer
bfe6af2eed fuzz: Remove hacks to build lint fuzzer
Don't include source file directly.
2025-01-17 20:06:45 +01:00
Nick Wellnhofer
e41941109d schemas: Make ValidateStream take a const SAXHandler 2025-01-17 20:05:57 +01:00
Nick Wellnhofer
c134e8b4dc include: Make INPUT_CHUNK macro private 2024-12-21 20:02:34 +01:00
Nick Wellnhofer
84a6c82ff8 include: Make most IS_* macros private
Macros like IS_DIGIT or IS_LETTER severely pollute the C namespace.
2024-12-21 20:01:30 +01:00
Nick Wellnhofer
2e18e5dc6d memory: Grow dynamic arrays by 50%
Growing by a factor lower than the golden ratio increases the chances of
reusing memory freed from earlier allocations. Set growth rate to 1.5
which also reduces internal fragmentation.
2024-12-21 19:37:38 +01:00
Nick Wellnhofer
5320a4aa38 memory: Implement xmlGrowCapacity to safely grow arrays
xmlGrowCapacity makes sure that dynamic arrays don't grow beyond an
explicit maximum size. size_t considerations are also taken into account.
A macro XML_MAX_ITEMS is provided as default maximum with value
1 billion.

When fuzzing, the initial size is set to 1 to cause more reallocations.
This can require adjustments if callers really need larger arrays.
2024-12-21 19:37:37 +01:00
Nick Wellnhofer
0dd910e82b save: Fix handling of catastrophic errors
Don't overwrite catastrophic errors xmlSaveErr.

Overwrite non-catastrophic errors in xmlOutputBufferClose.
2024-12-19 02:30:36 +01:00
Nick Wellnhofer
57087e5fc7 parser: Don't overwrite catastrophic errors
Stop reporting errors after a catastrophic error.

Also make sure that ctxt->errNo matches ctxt->lastError.code.
2024-11-26 00:47:48 +01:00
Nick Wellnhofer
0dc26910c1 parser: Deprecate more internal functions 2024-11-21 22:31:20 +01:00
Nick Wellnhofer
a227a71ac9 regexp: Deprecate internal functions 2024-11-20 17:03:11 +01:00
Nick Wellnhofer
0f4f89005d parser: Rename inputPush to xmlCtxtPushInput 2024-11-19 00:25:23 +01:00
Nick Wellnhofer
e2ad249c23 parser: Deprecate more internal symbols
- xmlParseExternalSubset
- xmlPushInput
- xmlPopInput
- xmlCopyCharMultiByte
- xmlCreateEntityParserCtxt
- xmlStringComment
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
4d1f35b0a9 valid: Deprecate more internal functions 2024-11-19 00:03:37 +01:00
Nick Wellnhofer
5a51f08517 valid: Implement xmlCtxtValidateDocument
This allows to use the error handler or resource loader of a parser
context.
2024-11-19 00:03:37 +01:00
Nick Wellnhofer
7f8c436c75 parser: Implement xmlCtxtParseDtd and xmlCtxtValidateDtd
This allows to use the context's error handler, options and other
settings.

Fixes #808.
2024-11-15 16:30:52 +01:00
Nick Wellnhofer
0bc4608c50 html: Use hash table to check for duplicate attributes 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
c32397d51f html: Improve character class macros 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
c34d0ae9cc html: Deprecate htmlIsBooleanAttr 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
6040785ac4 html: Deprecate AutoClose API 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
188cad68a4 html: Remove obsolete content model 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
462bf0b7a5 html: Rework options
Introduce htmlCtxtSetOptions, see similar changes made to XML parser.

Add HTML_PARSE_HUGE alias. Support HTML_PARSE_BIG_LINES.
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
e062a4a9b3 html: Add HTML5 parser option
This option passes tokenizer output directly to the SAX callbacks,
making it possible to test the tokenizer against the html5lib test
suite.

This will produce unbalanced calls to the startElement and endElement
callbacks, but it's the only way to support a SAX like interface for
HTML5. It can be used for filtering or rewriting HTML5, for example.

A HTML5 tree builder could then be implemented on top of the SAX
callbacks.
2024-10-06 18:13:05 +02:00
Nick Wellnhofer
f9ed30e972 html: HTML5 character data states 2024-10-06 18:13:05 +02:00
Nick Wellnhofer
b1c5aa6544 xpath: Deprecate xmlXPathNAN and xmlXPath*INF
Users should simply use the C99 macros.
2024-09-19 12:50:59 +02:00
Nick Wellnhofer
c46b89e243 xpath: Deprecate xmlXPathEvalExpr
Also check the argument instead of crashing if there's no context.
2024-09-13 21:06:36 +02:00
Nick Wellnhofer
de10d4cd5f include: Check whether _MSC_VER is defined
Should fix #795.
2024-09-04 16:32:22 +02:00
makise-homura
a3043b478f threads: define _WIN32_WINNT as 0x0600 to use InitOnceExecuteOnce() 2024-08-16 22:26:07 +03:00
Nick Wellnhofer
a530ff125d io: Always consume encoding handler when creating output buffers
Also free encoding handler in error case.

Remove xmlAllocOutputBufferInternal which was identical to
xmlAllocOutputBuffer.
2024-07-29 14:25:39 +02:00
Nick Wellnhofer
aa6ca0b1d3 module: Deprecate module API
This was only used by libxslt which switched to a private
implementation.
2024-07-23 19:57:32 +02:00
Nick Wellnhofer
6a3c0b0d93 parser: Increase XML_MAX_DICTIONARY_LIMIT
This limit is somewhat arbitrary and can be reached when fuzzing
documents up to 1 MB.

Increase limit to 100 MB and disable limit if XML_PARSE_HUGE is set.
2024-07-22 12:53:00 +02:00
Nick Wellnhofer
4e93425a7f threads: Prefer Win32 over pthreads 2024-07-16 20:03:01 +02:00
Nick Wellnhofer
769e5a4a42 threads: Allocate global RMutexes statically
Avoid memory allocations during initialization.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
5d36664fc9 memory: Deprecate xmlGcMemSetup 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
79e119954c error: Make xmlLastError const 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
eb66d03ef7 io: Deprecate a few functions 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
a6f54f055b io: Fine-tune initial IO buffer size 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
34c9108f15 encoding: Add sizeOut argument to xmlCharEncInput
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
8e871a31f8 buf: Rework xmlBuffer code
Port most changes made to the xmlBuf code in f3807d76, except that
"size" still includes the terminating NULL byte.

Make xmlSetBufferAllocationScheme, xmlBufferAllocScheme and
xmlDefaultBufferSize no-ops.

Deprecate a few functions.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
a221cd7849 buf: Rework xmlBuf code
Always use what the old implementation called the "IO" allocation
scheme, allowing to move the content pointer past the initial
allocation. This is inexpensive and allows efficient shrinking.

Optimize xmlBufGrow, reusing shrunken memory as much as possible.

Simplify xmlBufAdd.

Make xmlBufBackToBuffer return an error on overflow.

Make "size" exclude the terminating NULL byte.

Always provide an initial size.

Reintroduce static buffers.

Remove xmlBufResize and several other functions.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
1cfc5b8089 entities: Rework serialization of numeric character references 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
8d1606265d entities: Rework text escaping 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
e488695b1a save: Deprecate xmlSaveSet*Escape
xmlSaveSetAttrEscape never had an effect.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
e0494c0d43 io: Add some deprecation warnings 2024-07-15 16:33:38 +02:00
Nick Wellnhofer
728869809e error: Add helper functions to print errors and abort 2024-07-15 16:33:38 +02:00
Nick Wellnhofer
69f12d6d47 encoding: Deprecate xmlByteConsumed
This was only used by Chromium/WebKit to detect whether xmlParseContent
really succeeded. It's a horrible, overcomplicated hack.

See 8c5848bd and #767.
2024-07-13 15:42:02 +02:00
Nick Wellnhofer
8af55c8d20 parser: Rename new input API functions
These weren't made public yet.
2024-07-11 01:33:29 +02:00
Nick Wellnhofer
d74ca59491 parser: Rename internal xmlNewInput functions 2024-07-11 01:31:50 +02:00
Nick Wellnhofer
4f329dc524 parser: Implement xmlCtxtParseContent
This implements xmlCtxtParseContent, a better alternative to
xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a
parser context and a parser input, making it a lot more versatile.

xmlParseInNodeContext is now implemented in terms of
xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never
modifies the target document, improving thread safety.
xmlParseInNodeContext is also more lenient now with regard to undeclared
entities.

Fixes #727.
2024-07-11 01:26:32 +02:00
Nick Wellnhofer
82e0455cf6 Undeprecate some symbols for now
- xmlKeepBlanksDefault is needed as a work-around for
  xmlParseBalancedChunk, see issue #727.
- ctxt->options already has an accessor and will be deprecated
  later.
- input->cur, input->base, input->end: See #762.
2024-07-06 20:19:51 +02:00
Nick Wellnhofer
38195cf596 parser: Don't produce names with invalid UTF-8 in recovery mode 2024-07-06 15:33:06 +02:00