Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						dc2d498318 
					 
					
						
						
							
							html: Rework htmlLookupSequence  
						
						
						
						
						Rename to htmlLookupString and use strstr for increased performance. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
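The idea of searching with strstr can be sketched as follows. This is only an illustration under stated assumptions: `lookup_string` is a hypothetical helper, not the actual htmlLookupString from libxml2, and it assumes the input buffer is NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not libxml2's htmlLookupString): search a
 * NUL-terminated buffer for `needle`, starting at offset `start`.
 * Returns the offset of the first match, or -1 if there is none yet.
 */
static long
lookup_string(const char *buf, size_t start, const char *needle)
{
    const char *hit;

    assert(buf != NULL && needle != NULL);

    /* Never start the search past the terminating NUL. */
    if (start > strlen(buf))
        return -1;

    /* strstr is typically heavily optimized in the C library. */
    hit = strstr(buf + start, needle);
    return hit == NULL ? -1 : (long)(hit - buf);
}
```

A push parser could call such a helper with the offset it has already scanned, so that data received earlier is not searched again.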
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						637215a4de 
					 
					
						
						
							
							html: Always terminate doctype declarations on '>'  
						
						
						
						
						Align with HTML5 spec. This allows removing the old quote handling in
htmlLookupSequence. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						72e29f9a3d 
					 
					
						
						
							
							html: Fix quadratic behavior in push parser  
						
						
						
						
						Fix quadratic behavior related to unquoted attribute values. We really
have to replicate parts of the HTML5 state machine to find the end of
tags reliably.
Fixes #533. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
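Why a small state machine is needed to find the end of a tag can be sketched like this. The code is a heavily simplified assumption, not the libxml2 implementation: `find_tag_end` and its states are hypothetical names, and the real HTML5 tokenizer has many more states. The point is that a '>' inside a quoted attribute value does not end the tag, while a quote inside an unquoted value does not start a quoted section.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical subset of the HTML5 tokenizer states. */
enum tag_state { IN_TAG, AFTER_EQ, VAL_DQ, VAL_SQ, VAL_UNQ };

static int
is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

/*
 * Return the offset of the '>' that terminates the tag starting at
 * buf[0], or -1 if the tag is not complete within `len` bytes.
 */
static long
find_tag_end(const char *buf, size_t len)
{
    enum tag_state st = IN_TAG;
    size_t i;

    assert(buf != NULL);

    for (i = 0; i < len; i++) {
        char c = buf[i];

        switch (st) {
        case IN_TAG:            /* tag name, attribute names, spaces */
            if (c == '>') return (long)i;
            if (c == '=') st = AFTER_EQ;
            break;
        case AFTER_EQ:          /* waiting for the attribute value */
            if (c == '>') return (long)i;
            if (c == '"') st = VAL_DQ;
            else if (c == '\'') st = VAL_SQ;
            else if (!is_space(c)) st = VAL_UNQ;
            break;
        case VAL_DQ:            /* '>' is plain data here */
            if (c == '"') st = IN_TAG;
            break;
        case VAL_SQ:
            if (c == '\'') st = IN_TAG;
            break;
        case VAL_UNQ:           /* quotes are plain data here */
            if (c == '>') return (long)i;
            if (is_space(c)) st = IN_TAG;
            break;
        }
    }
    return -1;
}
```

To keep the total work linear across many small chunks, a push parser would additionally have to remember the state and offset reached so far instead of rescanning from the start of the tag; that bookkeeping is omitted here.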
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						a80f8b64a9 
					 
					
						
						
							
							html: Allow attributes in end tags  
						
						
						
						
						Attributes are syntactically allowed in HTML5 end tags but otherwise
ignored. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						f2272c231b 
					 
					
						
						
							
							html: Handle unexpected-solidus-in-tag according to HTML5  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						939b53ee12 
					 
					
						
						
							
							html: Stop skipping tag content  
						
						
						
						
						Tag and attribute names should always be parsed successfully now. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						dcb2abb2fe 
					 
					
						
						
							
							html: Parse tag and attribute names according to HTML5  
						
						
						
						
						HTML5 allows basically all characters in tag and attribute names. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						5d36664fc9 
					 
					
						
						
							
							memory: Deprecate xmlGcMemSetup  
						
						
						
						
					 
					
						2024-07-16 17:42:10 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						8af55c8d20 
					 
					
						
						
							
							parser: Rename new input API functions  
						
						
						
						
						These weren't made public yet. 
						
						
					 
					
						2024-07-11 01:33:29 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						d74ca59491 
					 
					
						
						
							
							parser: Rename internal xmlNewInput functions  
						
						
						
						
					 
					
						2024-07-11 01:31:50 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						4f329dc524 
					 
					
						
						
							
							parser: Implement xmlCtxtParseContent  
						
						
						
						
						This implements xmlCtxtParseContent, a better alternative to
xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a
parser context and a parser input, making it a lot more versatile.
xmlParseInNodeContext is now implemented in terms of
xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never
modifies the target document, improving thread safety.
xmlParseInNodeContext is also more lenient now with regard to undeclared
entities.
Fixes #727. 
						
						
					 
					
						2024-07-11 01:26:32 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						2e63656ec6 
					 
					
						
						
							
							parser: Check return value of inputPush  
						
						
						
						
						inputPush typically doesn't fail because we pre-allocate the input
table. The return value should be checked nevertheless. 
						
						
					 
					
						2024-07-08 11:27:52 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						fdfeecfe5e 
					 
					
						
						
							
							parser: Reenable ctxt->directory  
						
						
						
						
						Unused internally, but used in downstream code.
Should fix #753. 
						
						
					 
					
						2024-07-02 22:06:53 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						30ef77554b 
					 
					
						
						
							
							parser: Don't use deprecated xmlCopyChar  
						
						
						
						
					 
					
						2024-07-02 13:34:11 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						dd8e378513 
					 
					
						
						
							
							HTML: Rework UTF8ToHtml  
						
						
						
						
						Optimize code. Check for XML_ENC_ERR_SPACE. Use error macros. 
						
						
					 
					
						2024-07-01 18:05:40 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						f505dcaea0 
					 
					
						
						
							
							tree: Remove underscores from xmlRegisterCallbacks  
						
						
						
						
					 
					
						2024-06-27 14:45:35 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						1112699cfa 
					 
					
						
						
							
							legacy: Remove most legacy functions from public headers  
						
						
						
						
						Also remove warning messages. 
						
						
					 
					
						2024-06-17 15:47:42 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						039ce1e821 
					 
					
						
						
							
							parser: Pass global object to sax->setDocumentLocator  
						
						
						
						
						Revert part of commit c011e760.
Fixes #732. 
						
						
					 
					
						2024-06-14 16:41:43 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						89fcae4dfd 
					 
					
						
						
							
							parser: Don't report malloc failures when creating context  
						
						
						
						
						We don't want messages to stderr before an error handler could be set on
a parser context. 
						
						
					 
					
						2024-06-12 16:36:12 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						e75e878e02 
					 
					
						
						
							
							doc: Update and fix documentation  
						
						
						
						
					 
					
						2024-05-20 14:23:39 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						a4c2b7233f 
					 
					
						
						
							
							io: Don't set close callback in xmlParserInputBufferCreateFd  
						
						
						
						
					 
					
						2024-05-05 17:27:12 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						05654cfe00 
					 
					
						
						
							
							html: Deprecate htmlHandleOmittedElem  
						
						
						
						
					 
					
						2024-04-28 18:58:27 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						aa04838eab 
					 
					
						
						
							
							html: Use binary search in htmlEntityValueLookup  
						
						
						
						
					 
					
						2024-03-26 14:21:11 +01:00 
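A binary search over an entity table sorted by code point might look like the sketch below. The table is a small illustrative excerpt and the names `entity`, `entities` and `entity_value_lookup` are assumptions, not the data structures libxml2 actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry type: code point and entity name. */
struct entity {
    unsigned value;
    const char *name;
};

/* Illustrative excerpt, sorted by ascending code point. */
static const struct entity entities[] = {
    { 34, "quot" }, { 38, "amp" }, { 60, "lt" }, { 62, "gt" },
    { 160, "nbsp" }, { 169, "copy" }, { 8364, "euro" },
};

#define NUM_ENTITIES (sizeof(entities) / sizeof(entities[0]))

/* Binary search is only valid on a sorted table; check in debug builds. */
static void
assert_sorted(void)
{
    size_t i;

    for (i = 1; i < NUM_ENTITIES; i++)
        assert(entities[i - 1].value < entities[i].value);
}

/* Return the entry for `value`, or NULL if there is none. */
static const struct entity *
entity_value_lookup(unsigned value)
{
    size_t lo = 0, hi = NUM_ENTITIES;   /* search the range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (entities[mid].value < value)
            lo = mid + 1;
        else if (entities[mid].value > value)
            hi = mid;
        else
            return &entities[mid];
    }
    return NULL;
}
```

Compared with a linear scan, the number of comparisons grows with the logarithm of the table size, which matters for a table with a few hundred entries that is consulted for many characters during serialization.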
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						3efbe916a1 
					 
					
						
						
							
							parser: Mark 'token' member as unused in xmlParserCtxt  
						
						
						
						
					 
					
						2024-01-05 20:39:40 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						b82fd81d06 
					 
					
						
						
							
							parser: Rework xmlCtxtParseDocument  
						
						
						
						
						Make xmlCtxtParseDocument take a parser input which can be popped after
parsing. 
						
						
					 
					
						2024-01-05 20:39:40 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						7e0bbbc143 
					 
					
						
						
							
							parser: New input API  
						
						
						
						
						Provide a new set of functions to create xmlParserInputs. These can be
used for the document entity or from external entity loaders.
- Don't require xmlParserInputBuffer.
- All functions take a base URI.
- All functions take an encoding as string.
- xmlNewInputURL also takes a public ID.
- xmlNewInputMemory takes a size_t.
- Optimization hints for memory buffers.
Improve documentation.
Only call xmlInitParser before allocating a new parser context.
Call xmlCtxtUseOptions as early as possible. 
						
						
					 
					
						2023-12-29 01:22:13 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						6a9a88a17f 
					 
					
						
						
							
							parser: Move progressive flag into input struct  
						
						
						
						
					 
					
						2023-12-29 01:20:08 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						d944a41515 
					 
					
						
						
							
							parser: Fix in-parameter-entity and in-external-dtd checks  
						
						
						
						
						Use ctxt->input->entity instead of ctxt->inputNr to determine whether
we are inside a parameter entity.
Stop using ctxt->external to check whether we're in an external DTD.
This is signaled by ctxt->inSubset == 2. 
						
						
					 
					
						2023-12-29 01:19:56 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						477a7ed82c 
					 
					
						
						
							
							html: Abort earlier on fatal errors  
						
						
						
						
					 
					
						2023-12-28 19:43:48 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						c1bddd4c26 
					 
					
						
						
							
							parser: Mark 'length' member of xmlParserInput as unused  
						
						
						
						
					 
					
						2023-12-25 23:38:40 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						955c177f69 
					 
					
						
						
							
							parser: Stop using 'directory' struct member  
						
						
						
						
						This was only used as a pointless fallback for URI resolution. 
						
						
					 
					
						2023-12-25 23:38:40 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						8cd563174a 
					 
					
						
						
							
							html: Don't close fd in htmlCtxtReadFd  
						
						
						
						
						Long-standing bug. The XML fix from 2003 was never ported to the HTML
parser. htmlReadFd was fixed with fe6890e2. 
						
						
					 
					
						2023-12-21 15:02:24 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						130436917c 
					 
					
						
						
							
							parser: Rename xmlErrParser to xmlCtxtErr  
						
						
						
						
					 
					
						2023-12-21 15:02:24 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						8d0aaf4b95 
					 
					
						
						
							
							parser: Remove xmlErrEncoding  
						
						
						
						
						Use xmlFatalErr or xmlCtxtErrIO. 
						
						
					 
					
						2023-12-21 15:02:24 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						54c70ed57f 
					 
					
						
						
							
							parser: Improve error handling  
						
						
						
						
						Introduce xmlCtxtSetErrorHandler, which allows setting a structured
error handler for a parser context. There already was the "serror" SAX
handler, but this always receives the parser context as argument.
Start to use xmlRaiseMemoryError.
Remove useless arguments from memory error functions. Rename
xmlErrMemory to xmlCtxtErrMemory.
Remove a few calls to xmlGenericError.
Remove support for runtime entity debugging. 
						
						
					 
					
						2023-12-21 02:46:27 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						c2bbeed1fd 
					 
					
						
						
							
							io: Fix memory lifetime issue with input buffers  
						
						
						
						
						xmlParserInputBufferCreateMem must make a copy of the buffer.
This fixes a regression from 2.11 which could cause reads from freed
memory depending on the use case.
Undeprecate xmlParserInputBufferCreateStatic which can avoid copying
the whole buffer. 
						
						
					 
					
						2023-12-12 23:51:32 +01:00 
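The lifetime difference between a copying and a static (borrowing) input buffer can be illustrated with the following sketch. All names here (`input_buf`, `input_buf_init_copy`, `input_buf_init_static`, `input_buf_free`) are hypothetical and only model the idea; they are not the libxml2 functions or structures mentioned in the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical input buffer: either owns a copy or borrows memory. */
struct input_buf {
    const char *data;   /* bytes to parse */
    size_t size;
    char *owned;        /* non-NULL if we hold a private copy */
};

/* Copying variant: safe even if the caller frees `mem` right away. */
static int
input_buf_init_copy(struct input_buf *in, const char *mem, size_t size)
{
    assert(in != NULL && mem != NULL);

    in->owned = malloc(size + 1);
    if (in->owned == NULL)
        return -1;
    memcpy(in->owned, mem, size);
    in->owned[size] = '\0';
    in->data = in->owned;
    in->size = size;
    return 0;
}

/* Static variant: no copy, but `mem` must outlive the buffer. */
static void
input_buf_init_static(struct input_buf *in, const char *mem, size_t size)
{
    assert(in != NULL && mem != NULL);

    in->data = mem;
    in->size = size;
    in->owned = NULL;
}

static void
input_buf_free(struct input_buf *in)
{
    assert(in != NULL);

    free(in->owned);    /* free(NULL) is a no-op for borrowed memory */
    in->owned = NULL;
    in->data = NULL;
    in->size = 0;
}
```

The copying variant costs one allocation and a memcpy but cannot read freed memory; the static variant avoids the copy and shifts the lifetime responsibility to the caller.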
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						abd74186f9 
					 
					
						
						
							
							html: Report malloc failures  
						
						
						
						
						Fix many places where malloc failures aren't reported.
Stop checking for ctxt->instate. 
						
						
					 
					
						2023-12-11 22:13:06 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						c011e7605d 
					 
					
						
						
							
							globals: Remove unused globals from thread storage  
						
						
						
						
						Setting these deprecated globals hasn't had an effect for a long time.
Make them constants. This reduces the size of per-thread storage from
~700 to ~250 bytes. 
						
						
					 
					
						2023-12-06 20:07:54 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						c7629c9eb1 
					 
					
						
						
							
							parser: Clarify documentation regarding xmlReadMemory buffer size  
						
						
						
						
						Fixes #638. 
					
						2023-11-30 16:52:34 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						e395946194 
					 
					
						
						
							
							html: Reenable buggy detection of XML declarations  
						
						
						
						
						Switch to UTF-8 if a document starts with '<?xm' to match old behavior.
Also enable this check in the push parser.
Fixes #637. 
						
						
					 
					
						2023-11-30 16:22:59 +01:00 
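The prefix check described in this commit message can be sketched in a few lines. `starts_with_xml_decl` is a hypothetical helper name; how the HTML parser actually switches the encoding afterwards is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the first four bytes of the
 * document are "<?xm", which the old behavior treated as a hint
 * that the input should be decoded as UTF-8.
 */
static int
starts_with_xml_decl(const char *buf, size_t len)
{
    assert(buf != NULL);

    return len >= 4 && memcmp(buf, "<?xm", 4) == 0;
}
```

In a push parser the check can only be made once at least four bytes have arrived, which is why the length is tested first.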
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						ff6c318862 
					 
					
						
						
							
							include: Remove useless 'const' from function arguments  
						
						
						
						
					 
					
						2023-11-23 15:27:00 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						b9db3d7d02 
					 
					
						
						
							
							parser: Simplify xmlStringCurrentChar  
						
						
						
						
						Start to move away from using this function. 
						
						
					 
					
						2023-09-22 19:01:11 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						8c084ebdc7 
					 
					
						
						
							
							doc: Make apibuild.py happy  
						
						
						
						
					 
					
						2023-09-21 22:57:33 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						c5890716a6 
					 
					
						
						
							
							html: Fix logic in htmlAutoClose  
						
						
						
						
						Note that the function is never called with a NULL newtag.
Fixes #591. 
						
						
					 
					
						2023-09-21 17:01:35 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						9b5cce7a71 
					 
					
						
						
							
							include: Remove more unnecessary includes  
						
						
						
						
					 
					
						2023-09-21 01:50:53 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						11a1839ddd 
					 
					
						
						
							
							globals: Move remaining globals back to correct header files  
						
						
						
						
						This undoes a lot of damage. 
						
						
					 
					
						2023-09-20 22:06:49 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						4e1c13ebfd 
					 
					
						
						
							
							debug: Remove debugging code  
						
						
						
						
						This is barely useful these days and only clutters the code base. 
						
						
					 
					
						2023-09-19 17:35:09 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						e48f2695fe 
					 
					
						
						
							
							parser: Remove push parser debugging code  
						
						
						
						
					 
					
						2023-08-29 18:17:09 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						0d24fc0a47 
					 
					
						
						
							
							html: Remove encoding hack in htmlCreateFileParserCtxt  
						
						
						
						
						Switch encoding directly instead of calling htmlCheckEncoding with faked
content. 
						
						
					 
					
						2023-08-14 12:53:49 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						5db5a704eb 
					 
					
						
						
							
							html: Fix UAF in htmlCurrentChar  
						
						
						
						
						Short-lived regression found by OSS-Fuzz. 
						
						
					 
					
						2023-08-09 18:40:25 +02:00