Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						53c131f667 
					 
					
						
						
							
							doc: Make apibuild.py work again  
						
						
						
						
					 
					
						2024-12-26 20:29:58 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						0447275ef8 
					 
					
						
						
							
							html: Check reallocations for overflow  
						
						
						
						
					 
					
						2024-12-21 19:37:37 +01:00 
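
The commit subject above only names the goal (overflow-checked reallocations), so the following is a minimal sketch of the general technique, not libxml2's actual code. The helper names `growCapacity` and `growArray`, the initial capacity of 16, and the doubling policy are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: compute the next capacity (in elements) for a
 * growing array. Returns 0 and stores the result in *out on success,
 * or -1 if doubling, or the later byte-size multiplication, would
 * overflow size_t.
 */
int growCapacity(size_t cur, size_t elemSize, size_t *out) {
    size_t next;

    assert(out != NULL && elemSize > 0);

    if (cur > SIZE_MAX / 2)
        return -1;                      /* cur * 2 would wrap around */
    next = (cur == 0) ? 16 : cur * 2;   /* assumed growth policy */
    if (next > SIZE_MAX / elemSize)
        return -1;                      /* next * elemSize would wrap */
    *out = next;
    return 0;
}

/*
 * Hypothetical wrapper: grow an array in place. On failure the old
 * pointer stays valid and *cap is left unchanged.
 */
void *growArray(void *ptr, size_t *cap, size_t elemSize) {
    size_t newCap;
    void *tmp;

    assert(cap != NULL);

    if (growCapacity(*cap, elemSize, &newCap) < 0)
        return NULL;
    tmp = realloc(ptr, newCap * elemSize);
    if (tmp != NULL)
        *cap = newCap;
    return tmp;
}
```

Checking before multiplying matters because an unsigned product that wraps around yields a small allocation, and later writes then run past the end of the buffer.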
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						6548ba11b8 
					 
					
						
						
							
							parser: Fix argument checks in xmlCtxtParse*  
						
						
						
						
						- Raise invalid argument error.
- Free input stream if ctxt is NULL. 
						
						
					 
					
						2024-12-13 17:57:11 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						497081baab 
					 
					
						
						
							
							parser: Remove remaining calls to xml{Push|Pop}Input  
						
						
						
						
					 
					
						2024-11-19 00:25:23 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						0f4f89005d 
					 
					
						
						
							
							parser: Rename inputPush to xmlCtxtPushInput  
						
						
						
						
					 
					
						2024-11-19 00:25:23 +01:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						225ed70737 
					 
					
						
						
							
							html: Accelerate htmlParseCharData  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						207999793f 
					 
					
						
						
							
							html: Handle numeric character references directly  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						0bc4608c50 
					 
					
						
						
							
							html: Use hash table to check for duplicate attributes  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						24a6149fc4 
					 
					
						
						
							
							html: Make sure that character data mode is reset  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						c32397d51f 
					 
					
						
						
							
							html: Improve character class macros  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						e840655414 
					 
					
						
						
							
							html: Rewrite parsing of most data  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						f77ec16db0 
					 
					
						
						
							
							html: Optimize htmlParseCharData  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						440bd64c69 
					 
					
						
						
							
							html: Optimize htmlParseHTMLName  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						6040785ac4 
					 
					
						
						
							
							html: Deprecate AutoClose API  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						188cad68a4 
					 
					
						
						
							
							html: Remove obsolete content model  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						0144f662d7 
					 
					
						
						
							
							html: Remove obsolete code  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						575be6c1f1 
					 
					
						
						
							
							html: Fix line numbers with CRs  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						be874d7831 
					 
					
						
						
							
							html: Ignore unexpected DOCTYPE declarations  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						462bf0b7a5 
					 
					
						
						
							
							html: Rework options  
						
						
						
						
						Introduce htmlCtxtSetOptions, see similar changes made to XML parser.
Add HTML_PARSE_HUGE alias. Support HTML_PARSE_BIG_LINES. 
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						42c3823df0 
					 
					
						
						
							
							html: Update comment  
						
						
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						9f04cce695 
					 
					
						
						
							
							html: Remove unused or useless return codes  
						
						
						
						
						htmlParseStartTag should always succeed (except for malloc failures). 
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						e179f3ec0e 
					 
					
						
						
							
							html: Stop reporting syntax errors  
						
						
						
						
						It doesn't make much sense to keep the old syntax error handling which
doesn't conform to HTML5.
Handling HTML5 parser errors is rather involved and not essential for
parsers. 
						
						
					 
					
						2024-10-06 20:04:00 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						27752f75ca 
					 
					
						
						
							
							html: Fix EOF handling in start tags  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						b19d353970 
					 
					
						
						
							
							html: Fix EOF handling in comments  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						17e56ac54a 
					 
					
						
						
							
							html: Fix parsing of end tags  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						24a09033c9 
					 
					
						
						
							
							html: Fix bogus end tags  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						bca6485476 
					 
					
						
						
							
							html: Allow U+000C FORM FEED as whitespace  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						6edf1a645e 
					 
					
						
						
							
							html: Fix DOCTYPE parsing  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						9678163f54 
					 
					
						
						
							
							html: Don't check for valid XML characters  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						a6955c13c7 
					 
					
						
						
							
							html: Parse numeric character references according to HTML5  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						4eeac30944 
					 
					
						
						
							
							html: Start to fix EOF and U+0000 handling  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						e062a4a9b3 
					 
					
						
						
							
							html: Add HTML5 parser option  
						
						
						
						
						This option passes tokenizer output directly to the SAX callbacks,
making it possible to test the tokenizer against the html5lib test
suite.
This will produce unbalanced calls to the startElement and endElement
callbacks, but it's the only way to support a SAX-like interface for
HTML5. It can be used for filtering or rewriting HTML5, for example.
An HTML5 tree builder could then be implemented on top of the SAX
callbacks. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						17da54c522 
					 
					
						
						
							
							html: Normalize newlines  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						341dc78f24 
					 
					
						
						
							
							html: Deduplicate code in htmlCurrentChar  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						3adb396d87 
					 
					
						
						
							
							html: Parse bogus comments instead of ignoring them  
						
						
						
						
						Also treat XML processing instructions as bogus comments. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						8444017578 
					 
					
						
						
							
							html: Add missing calls to htmlCheckParagraph()  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						86d6b9b051 
					 
					
						
						
							
							html: Deduplicate some code  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						0d324bde36 
					 
					
						
						
							
							html: Simplify node info accounting  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						ccb61f599e 
					 
					
						
						
							
							html: Remove duplicate calls to htmlAutoClose  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						f9ed30e972 
					 
					
						
						
							
							html: HTML5 character data states  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						5951179239 
					 
					
						
						
							
							html: Parse named character references according to HTML5  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						d5cd0f07f8 
					 
					
						
						
							
							html: Prefer SKIP(1) over NEXT in HTML parser  
						
						
						
						
						Use SKIP(1) where it's safe to avoid a function call. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						dc2d498318 
					 
					
						
						
							
							html: Rework htmlLookupSequence  
						
						
						
						
						Rename to htmlLookupString and use strstr for increased performance. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
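
The idea behind commit dc2d498318 (searching for a terminator with `strstr` instead of a hand-written loop) can be sketched as follows. This is an illustration only: the function name `lookupString`, its signature, and the assumption of a NUL-terminated buffer are hypothetical and are not taken from the libxml2 source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find `needle` in the NUL-terminated buffer
 * `buf`, starting the search at byte offset `start`. Returns the
 * offset of the first match relative to `buf`, or -1 if there is none.
 */
long lookupString(const char *buf, size_t start, const char *needle) {
    const char *hit;

    assert(buf != NULL && needle != NULL);

    if (start > strlen(buf))
        return -1;                  /* start lies past the end of the data */
    hit = strstr(buf + start, needle);
    return (hit == NULL) ? -1 : (long) (hit - buf);
}
```

For example, `lookupString("<!-- a --> b", 0, "-->")` returns 7, the offset of the comment terminator. Delegating to `strstr` lets the C library use its optimized search routine, which is the performance gain the commit message refers to.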
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						637215a4de 
					 
					
						
						
							
							html: Always terminate doctype declarations on '>'  
						
						
						
						
						Align with HTML5 spec. This allows removing the old quote handling in
htmlLookupSequence. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						72e29f9a3d 
					 
					
						
						
							
							html: Fix quadratic behavior in push parser  
						
						
						
						
						Fix quadratic behavior related to unquoted attribute values. We really
have to replicate parts of the HTML5 state machine to find the end of
tags reliably.
Fixes #533. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
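
To illustrate why commit 72e29f9a3d needs a small state machine, here is a simplified, resumable scanner that looks for the `>` ending a tag. It is a sketch under stated assumptions, not the libxml2 implementation: the type and function names (`ScanState`, `TagScanner`, `tagScannerFeed`) are hypothetical, and the states are a reduced subset of the HTML5 tokenizer that ignores several edge cases. The scanner remembers its position and state between calls, so each byte is examined once even when the push parser receives the tag in many small chunks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner states, loosely modelled on the HTML5 tokenizer. */
typedef enum {
    SCAN_TAG,          /* tag name, attribute name, or between attributes */
    SCAN_BEFORE_VALUE, /* just after '=' */
    SCAN_UNQUOTED,     /* inside an unquoted attribute value */
    SCAN_DQUOTED,      /* inside a double-quoted attribute value */
    SCAN_SQUOTED       /* inside a single-quoted attribute value */
} ScanState;

typedef struct {
    size_t pos;        /* number of bytes already examined */
    ScanState state;   /* state to resume in */
} TagScanner;

static int isHtmlSpace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

/*
 * Examine buf[s->pos .. len-1]. `buf` must hold the tag from its '<'
 * onwards and may have grown since the previous call. Returns the
 * index of the '>' that ends the tag, or -1 if more data is needed.
 */
long tagScannerFeed(TagScanner *s, const char *buf, size_t len) {
    assert(s != NULL && buf != NULL);

    for (; s->pos < len; s->pos++) {
        char c = buf[s->pos];

        switch (s->state) {
        case SCAN_TAG:
            if (c == '>') return (long) s->pos;
            if (c == '=') s->state = SCAN_BEFORE_VALUE;
            break;
        case SCAN_BEFORE_VALUE:
            if (c == '>') return (long) s->pos;
            if (c == '"') s->state = SCAN_DQUOTED;
            else if (c == '\'') s->state = SCAN_SQUOTED;
            else if (!isHtmlSpace(c)) s->state = SCAN_UNQUOTED;
            break;
        case SCAN_UNQUOTED:
            /* Quotes are ordinary characters in an unquoted value. */
            if (c == '>') return (long) s->pos;
            if (isHtmlSpace(c)) s->state = SCAN_TAG;
            break;
        case SCAN_DQUOTED:
            if (c == '"') s->state = SCAN_TAG;
            break;
        case SCAN_SQUOTED:
            if (c == '\'') s->state = SCAN_TAG;
            break;
        }
    }
    return -1;
}
```

A scanner that only toggled on quote characters would treat the `"` in `<a href=x"y>` as opening a quoted value and keep searching past the real end of the tag. Tracking whether a value is quoted or unquoted avoids that, and the saved `pos` avoids rescanning the buffer from the start on every new chunk.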
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						a80f8b64a9 
					 
					
						
						
							
							html: Allow attributes in end tags  
						
						
						
						
						Attributes are syntactically allowed in HTML5 end tags but otherwise
ignored. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						f2272c231b 
					 
					
						
						
							
							html: Handle unexpected-solidus-in-tag according to HTML5  
						
						
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						939b53ee12 
					 
					
						
						
							
							html: Stop skipping tag content  
						
						
						
						
						Tag and attribute names should always be parsed successfully now. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						dcb2abb2fe 
					 
					
						
						
							
							html: Parse tag and attribute names according to HTML5  
						
						
						
						
						HTML5 allows basically all characters in tag and attribute names. 
						
						
					 
					
						2024-10-06 18:13:05 +02:00 
						 
				 
			
				
					
						
							
							
								Nick Wellnhofer 
							
						 
					 
					
						
						
							
						
						5d36664fc9 
					 
					
						
						
							
							memory: Deprecate xmlGcMemSetup  
						
						
						
						
					 
					
						2024-07-16 17:42:10 +02:00