For https://bugzilla.gnome.org/show_bug.cgi?id=681822
Regardless of whether the option HTML_PARSE_NOBLANKS is set, blank nodes
are removed from an HTML document. For example:
<html>
<head>
<title>This is a test.</title>
</head>
<body>
<p>This is a test.</p>
</body>
</html>
is read as:
<html><head><title>This is a test.</title></head><body>
<p>This is a test.</p>
</body></html>
This patch changes the default behaviour so that blank nodes are kept; the
old behaviour is still available, as expected, through the parser flag
HTML_PARSE_NOBLANKS.
Based on original patch from Igor Ignatyuk <igor_ignatiouk@hotmail.com>
* HTMLparser.c: change various places in the parser where the ignorable_space
SAX callback was called without checking the parser flag preference
* xmllint.c: make sure we use the new flag even for HTML parsing
* result/HTML/*: this modifies the output of a number of tests
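
The HTMLparser.c change above can be pictured with a small standalone sketch.
This is not the libxml2 code: every name in it (SKETCH_PARSE_NOBLANKS,
sketch_handler, sketch_dispatch_text) is hypothetical, and the handler only
counts events. It assumes the fix amounts to consulting the option before a
whitespace-only run of text is routed to the ignorable-whitespace path.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parser option bit; not libxml2's real value. */
#define SKETCH_PARSE_NOBLANKS (1 << 0)

/* Hypothetical, minimal SAX-style handler that only counts events. */
typedef struct {
    int characters_calls;  /* text reported as real content */
    int ignorable_calls;   /* text reported as ignorable whitespace */
} sketch_handler;

/* Return 1 if the buffer holds only whitespace (an empty run counts as blank). */
static int sketch_is_blank(const char *text, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (!isspace((unsigned char)text[i]))
            return 0;
    }
    return 1;
}

/*
 * Dispatch one run of text. A blank run takes the ignorable-whitespace
 * path only when the caller asked for it through the option bit;
 * otherwise it is reported as ordinary characters and therefore kept.
 */
static void sketch_dispatch_text(sketch_handler *h, int options,
                                 const char *text) {
    size_t len = strlen(text);

    if (sketch_is_blank(text, len) && (options & SKETCH_PARSE_NOBLANKS))
        h->ignorable_calls++;
    else
        h->characters_calls++;
}
```

With options set to 0, a run such as "\n" is counted as characters; with
SKETCH_PARSE_NOBLANKS set, the same run is counted as ignorable, while
non-blank text is always counted as characters.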
* HTMLparser.c: Applied the last patch from Gary Coady for #304637
changing the behaviour when text nodes are found in body
* result/HTML/*: this changes the output of some tests
Daniel
* HTMLtree.c: fixed bug #310333 with a patch close to the provided
patch for HTML UTF-8 serialization
* result/HTML/script2.html: this changed the output of that test
Daniel
* HTMLparser.c: applied UTF-8 script parsing bug #310229 fix from
Jiri Netolicky
* result/HTML/script2.html* test/HTML/script2.html: added the test
case from the regression suite
Daniel