libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	ffb058f484	parser: Fix detection of duplicate attributes We really need a second scan if more than one namespace clash was detected.	2024-10-28 20:26:55 +01:00
Nick Wellnhofer	bd9eed4694	parser: Make unsupported encodings an error in declarations This was changed in `45157261`, but in encoding declarations, unsupported encodings should raise a fatal error. Fixes #794.	2024-09-02 19:29:39 +02:00
Nick Wellnhofer	4fefba4cf6	parser: Rework handling of undeclared entities Throw an error if entity substitution was requested. Now we only downgrade to a warning if XML_PARSE_DTDLOAD wasn't specified and either entities aren't substituted or XML_PARSE_NO_XXE was specified. Should fix #724.	2024-05-15 17:58:48 +02:00
Nick Wellnhofer	b717abdd09	parser: Consolidate error handling for undeclared entities Always use XML_WAR_UNDECLARED_ENTITY with warning error level in documents with external subset or parameter entities. Use XML_ERR_UNDECLARED_ENTITY otherwise.	2024-04-23 18:36:15 +02:00
Nick Wellnhofer	186562a182	parser: Fix detection of duplicate attributes in XML namespace Fixes a regression from commit `e0dd330b`, resulting in duplicate attributes in the predefined XML namespace not being detected or extraneous default attributes being passed. Fixes #704.	2024-03-12 20:02:52 +01:00
Nick Wellnhofer	29beef653c	parser: Pop inputs if parsing DTD failed This should provide some statistics in ctxt->sizeentcopy even in the error or recovery case.	2024-01-10 15:58:23 +01:00
Nick Wellnhofer	f237e5b934	parser: Avoid duplicate namespace errors Don't report an extra attribute uniqueness error if a namespace is undeclared. This matches old behavior.	2024-01-05 20:39:40 +01:00
Nick Wellnhofer	d0eb5a7e54	parser: Remove xmlErrEncodingInt Convert the last user to xmlFatalErr.	2024-01-04 15:28:57 +01:00
Nick Wellnhofer	e8fb3d639f	parser: Convert some "internal errors" to meaningful codes	2024-01-02 19:48:23 +01:00
Nick Wellnhofer	37c6618be5	parser: Rework parsing of attribute and entity values Don't use a separate function to handle "complex" attributes. Validate UTF-8 byte sequences without decoding. This should improve performance considerably when parsing multi-byte UTF-8 sequences. Use a string buffer to avoid unnecessary allocations and copying when expanding entities. Normalize attribute values in a single pass while expanding entities. Be more lenient in recovery mode. If no entity substitution was requested, validate entities without expanding. Fixes #596. Also fixes #655.	2024-01-02 15:42:03 +01:00
Nick Wellnhofer	4ecc85d2cb	parser: Push general entity input streams on the stack This allows the error handler to give more context.	2023-12-29 01:20:08 +01:00
Nick Wellnhofer	f3fa34dcad	parser: Fix general entity parsing Clear namespace database. Ignore non-fatal errors.	2023-12-28 16:47:41 +01:00
Nick Wellnhofer	ecfbcc8a52	parser: Rework general entity parsing Don't create a new parser context but reuse the existing one. This exposes bug #601 in a more obvious way.	2023-12-25 23:38:40 +01:00
Nick Wellnhofer	8d0aaf4b95	parser: Remove xmlErrEncoding Use xmlFatalErr or xmlCtxtErrIO.	2023-12-21 15:02:24 +01:00
Nick Wellnhofer	7e511f35f1	io: Pass error codes from xmlFileOpenReal to xmlNewInputFromFile This allows reporting the reason why opening a file failed to the parser context and improves error messages. Now we can also remove the stat call before opening a file.	2023-12-21 15:02:24 +01:00
Nick Wellnhofer	e58ea29f17	SAX2: Report malloc failures Fix many places where malloc failures aren't reported. Improve error handling when parsing entity declarations. Fixes #308.	2023-12-11 22:13:05 +01:00
Nick Wellnhofer	a1f7ecaef8	entities: Report malloc failures Fix places where malloc failures aren't reported. Introduce new API function xmlAddEntity that returns separate error codes. Don't invoke global error handler for low-level errors which should be handled by higher layers. Invalid redeclaration warnings will be fixed later.	2023-12-11 22:05:47 +01:00
Nick Wellnhofer	43b511fa71	parser: Make CRLF increment line number Partial revert of `cb927e85` fixing CRLFs not incrementing the line number. This requires reworking xmlParseQNameHashed. The original implementation prompted the change to xmlCurrentChar which really shouldn't modify the 'cur' pointer as a side effect. But the NEXTL macro relies on this behavior. Ultimately, we should reintroduce the change to xmlCurrentChar and fix the NEXTL macro. This will lead to single CRs incrementing the line number as well which seems more consistent. Fixes #628.	2023-11-26 15:18:09 +01:00
Nick Wellnhofer	134d2ad890	parser: Protect against quadratic default attribute expansion	2023-10-06 12:47:24 +02:00
Nick Wellnhofer	e48f3d8e0a	tests: Add more tests for redefined attributes	2023-09-29 12:43:08 +02:00
Nick Wellnhofer	53050b1dd8	parser: More fixes to push parser error handling	2023-08-29 20:06:43 +02:00
Nick Wellnhofer	bbd918b2e7	parser: Fix detection of null bytes Also suppress misleading extra errors. Fixes #122.	2023-08-29 18:43:10 +02:00
Nick Wellnhofer	c6083a32d6	parser: Improve error handling in push parser Report errors earlier and align error messages with the pull parser.	2023-08-29 18:41:05 +02:00
Nick Wellnhofer	855818bd2b	parser: Check for truncated multi-byte sequences When decoding input data, check whether the "raw" buffer is empty after parsing the document. Otherwise, the input ends with a truncated multi-byte sequence which shouldn't be silently ignored.	2023-08-08 15:21:37 +02:00
Nick Wellnhofer	74aa61e0bd	parser: Halt parser on DTD errors If we try to continue parsing after an error in the internal or external subset, entity expansion accounting gets more complicated. Simply halt the parser. Found with libFuzzer.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	d320a683d1	parser: Fix entity check in attributes Don't set the "checked" flag when checking entities in default attribute values. These entities could reference other entities which weren't defined yet, so the check isn't reliable. This fixes a short-lived regression which could lead to a call stack overflow later in xmlStringGetNodeList.	2023-01-17 13:59:24 +01:00
Nick Wellnhofer	a41b09c739	parser: Improve detection of entity loops Set a flag to detect entity loops at once instead of processing until the depth limit is exceeded.	2022-12-23 22:11:18 +01:00
Nick Wellnhofer	d972393f30	parser: Only report a single entity error Don't report errors multiple times for nested entity references.	2022-12-23 22:10:39 +01:00
Nick Wellnhofer	76c6da4209	error: Make sure that error messages are valid UTF-8 This has caused issues with the Python bindings for a long time. Should fix #64.	2022-12-04 23:34:19 +01:00
Nick Wellnhofer	68a6518c45	parser: Rewrite push parser boundary checks Remove inaccurate xmlParseCheckTransition check. Remove non-incremental xmlParseGetLasts check. Add functions that check for several boundary constructs more accurately, keeping track of progress in ctxt->checkIndex. Fixes #439.	2022-11-20 21:27:08 +01:00
Nick Wellnhofer	f61b8a6233	parser: Fix DTD parser progress checks This is another attempt at fixing parser progress checks. Instead of relying on in->consumed, which could overflow, change some DTD parser functions to make guaranteed progress on certain byte sequences.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	f1c32b4c78	Allow missing result files in runtest Treat missing files as empty.	2022-04-04 04:28:15 +02:00
Nick Wellnhofer	ce0871e15c	Only warn on invalid redeclarations of predefined entities Downgrade the error message to a warning since the error was ignored, anyway. Also print the name of the redeclared entity. For a proper fix that also shows filename and line number of the invalid redeclaration, we'd have to either pass the parser context to the entity functions somehow or make these functions return distinct error codes. Partial fix for #308.	2022-02-20 21:49:04 +01:00
Nick Wellnhofer	9edc20c154	Fix double counting of CRLF in comments Fixes #151.	2022-02-07 20:54:07 +01:00
Nick Wellnhofer	de5b624f10	Fix handling of unexpected EOF in xmlParseContent Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was removed in commit `62150ed2`. This commit also introduced a regression for direct users of xmlParseContent. Unclosed tags weren't checked.	2021-05-08 20:47:36 +02:00
Nick Wellnhofer	3e80560d4b	Fix line numbers in error messages for mismatched tags Commit `62150ed2` introduced a small regression in the error messages for mismatched tags. This typically only affected messages after the first mismatch, but with custom SAX handlers all line numbers would be off. This also fixes line numbers in the SAX push parser which were never handled correctly.	2021-05-07 11:48:11 +02:00
Nick Wellnhofer	01411e7c5e	Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<">, but the XML spec clearly states that this is illegal: "If the entities lt or amp are declared, they MUST be declared as internal entities whose replacement text is a character reference to the respective character (less-than sign or ampersand) being escaped; the double escaping is REQUIRED for these entities so that references to them produce a well-formed result." Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.	2021-02-08 21:51:26 +01:00
Nick Wellnhofer	79301d3d5e	Fix timeout when handling recursive entities Abort parsing early to avoid an almost infinite loop in certain error cases involving recursive entities. Found with libFuzzer.	2020-12-18 14:13:46 +01:00
Nick Wellnhofer	32cb5dccda	Add test case for recursive external parsed entities	2020-02-11 17:36:43 +01:00
Nick Wellnhofer	f20daa9e51	Enable error tests with entity substitution	2020-02-11 17:36:43 +01:00
Nick Wellnhofer	c2f209c09f	Disallow conditional sections in internal subset Conditional sections are only allowed in external parameter entities referenced from the internal subset.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	c51e38cb3a	Make xmlParseConditionalSections non-recursive Avoid call stack overflow in deeply nested conditional sections. Found by OSS-Fuzz.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	62150ed2ab	Make xmlParseContent and xmlParseElement non-recursive Split xmlParseElement into subfunctions. Use nameNsPush to store prefix, URI and nsNr on the heap, similar to the push parser. Closes #84.	2019-09-23 17:45:50 +02:00
Nick Wellnhofer	f9fce96313	Fix unsigned integer overflow It's defined behavior but -fsanitize=unsigned-integer-overflow is useful to discover bugs.	2019-05-20 13:38:22 +02:00
Nick Wellnhofer	123234f2cf	Free input buffer in xmlHaltParser This avoids miscalculation of available bytes. Thanks to Yunho Kim for the report. Closes: #26	2018-09-11 15:06:17 +02:00
Nick Wellnhofer	69936b129f	Revert "Print error messages for truncated UTF-8 sequences" This reverts commit `79c8a6b` which caused a serious regression in streaming mode. Also reverts part of commit `52ceced` "Fix infinite loops with push parser in recovery mode". Fixes bug 786554.	2017-08-30 14:19:06 +02:00
Nick Wellnhofer	899a5d9f0e	Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.	2017-07-25 15:21:12 +02:00
Nick Wellnhofer	872fea9485	Get rid of "blanks wrapper" for parameter entities Now that replacement of parameter entities goes exclusively through xmlSkipBlankChars, we can account for the surrounding space characters there and remove the "blanks wrapper" hack.	2017-06-20 13:19:47 +02:00
Nick Wellnhofer	24246c7626	Fix xmlHaltParser Pop all extra input streams before resetting the input. Otherwise, a call to xmlPopInput could make input available again. Also set input->end to input->cur. Changes the test output for some error tests. Unfortunately, some fuzzed test cases were added to the test suite without manual cleanup. This makes it almost impossible to review the impact of later changes on the test output.	2017-06-20 13:15:43 +02:00
Nick Wellnhofer	8bbe4508ef	Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.	2017-06-17 16:34:23 +02:00
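
Commits `43b511fa71` and `9edc20c154` in the log above both deal with how CRLF sequences affect line numbers: a CRLF pair should advance the line counter exactly once, and `43b511fa71` notes that counting lone CRs as well would be more consistent. The following is a minimal sketch of that counting rule in plain C. It is not libxml2 code: `line_at` is a hypothetical helper invented for illustration, and treating a lone CR as a line break is the behavior the commit message describes as a future goal, not necessarily what the parser does today.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2): return the 1-based line
 * number of the byte at offset `pos` in `buf`, which holds `len` bytes.
 * LF, CRLF and a lone CR each count as exactly one line break.
 */
static int
line_at(const char *buf, size_t len, size_t pos)
{
    int line = 1;
    size_t i;

    assert(buf != NULL);

    /* Clamp the offset so we never read past the buffer. */
    if (pos > len)
        pos = len;

    for (i = 0; i < pos; i++) {
        if (buf[i] == '\n') {
            /* A bare LF, or the LF that terminates a CRLF pair. */
            line++;
        } else if (buf[i] == '\r') {
            /* Skip the CR of a CRLF pair; the LF is counted instead. */
            if (i + 1 >= len || buf[i + 1] != '\n')
                line++;
        }
    }

    return line;
}
```

Under these assumptions, `line_at("a\r\nb", 4, 3)` reports line 2 for the `b`, so the CRLF is neither skipped nor counted twice. Counting the break at the LF rather than at the CR keeps the CR and LF of one pair on the same line.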

75 Commits