1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-26 00:37:43 +03:00
Commit Graph

957 Commits

Author SHA1 Message Date
Dennis Filder
7e9bbdf82f parser bug on misformed namespace attributes
For https://bugzilla.gnome.org/show_bug.cgi?id=672539
Reported by Axel Miller <axel.miller@ppi.de>

Consider the following start-tag:
<x xmlns=""version="">

The start-tag does not conform to the rule

[40]       STag       ::=       '<' Name (S Attribute)* S? '>'

since there is no whitespace in front of the attribute "version".

Thus, libxml2 should reject the start-tag.
But it doesn't:

$ echo '<x xmlns=""version=""/>' | xmllint -
<?xml version="1.0"?>
<x xmlns="" version=""/>

The error seems to happen only if there is a namespace declaration in
front of
the attribute. A missing whitespace between other attributes is handled
correctly:

$ echo '<x someattr=""version=""/>' | xmllint -
-:1: parser error : attributes construct error
<x someattr=""version=""/>
              ^
[...]
2014-10-06 20:34:14 +08:00
Juergen Keil
24fb4c329a wrong error column in structured error when parsing end tag
For https://bugzilla.gnome.org/show_bug.cgi?id=734283

libxml2 reports wrong error column numbers (field int2 in xmlError)
in structured error handler, after parsing an end tag.
2014-10-06 18:19:12 +08:00
Juergen Keil
33f658c969 wrong error column in structured error when parsing attribute values
For https://bugzilla.gnome.org/show_bug.cgi?id=734280

libxml2 reports wrong error column numbers (field int2 in xmlError)
in structured error handler, after parsing XML attribute values.

Example XML:

<?xml version="1.0" encoding="UTF-8"?>
<root
xmlns="urn:colbug">&</root>
<!--
         1         2         3         4
1234567890123456789012345678901234567890
-->

Expected location of the error would be line 3, column 21.

The actual location of the error is line 3, column 9:

$ ./xmlparse colbug2.xml
colbug2.xml:3:9: xmlParseEntityRef: no name

The 12 characters of the xmlns attribute value "urn:colbug" are
not accounted for in the error column value.
2014-08-07 17:30:36 +08:00
Juergen Keil
5d4310af45 wrong error column in structured error when skipping whitespace in xml decl
For https://bugzilla.gnome.org/show_bug.cgi?id=734276

libxml2 reports wrong error column numbers (field int2 in xmlError)
in structured error handler, after an XML declaration containing
whitespace.

Example XML:

<?xml  version="1.0"  encoding="UTF-8"     ?><root>&</root>
<!--
         1         2         3         4         5         6
123456789012345678901234567890123456789012345678901234567890
-->

Expected location of the error would be line 1, column 53.

The actual location of the error is line 1, column 44:

$ ./xmlparse colbug1.xml
colbug1.xml:1:44: xmlParseEntityRef: no name
2014-08-07 16:28:09 +08:00
Daniel Veillard
2f9b126a5c typo in error messages "colon are forbidden from..."
For https://bugzilla.gnome.org/show_bug.cgi?id=731511
Pointed byt vincent Lefevre
2014-07-26 20:29:36 +08:00
Daniel Veillard
c836ba66e5 Fix a potential NULL dereference
For https://bugzilla.gnome.org/show_bug.cgi?id=733040

xmlDictLookup() may return NULL in case of allocation error,
though very unlikely it need to be checked.
2014-07-14 16:39:50 +08:00
Daniel Veillard
dd8367da17 Fix regressions introduced by CVE-2014-0191 patch
A number of issues have been raised after the fix, and this patch
tries to correct all of them, though most were related to
postvalidation.
https://bugzilla.gnome.org/show_bug.cgi?id=730290
and other reports on list, off-list and on Red Hat bugzilla
2014-06-11 17:00:39 +08:00
Daniel Veillard
9cd1c3cfbd Do not fetch external parameter entities
Unless explicitely asked for when validating or replacing entities
with their value. Problem pointed out by Daniel Berrange <berrange@redhat.com>
2014-05-06 22:31:51 +08:00
Daniel Veillard
6faa126fc3 Fix xmlParseInNodeContext() if node is not element
We really need to have ctxt->instate == XML_PARSER_CONTENT when
jumping in content parsing
Bug reported by Frank Gross
2014-03-21 17:05:51 +08:00
Longstreth Jon
190a0b8939 Fix a portability issue on Windows
Apparently an verflow when comparing macro and unsigned long
2014-02-06 10:58:17 +01:00
Daniel Veillard
054c716ea1 Missing initialization for the catalog module 2014-01-26 15:02:25 +01:00
Daniel Veillard
4e1476c5ea adding init calls to xml and html Read parsing entry points
As pointed out by "Tassyns, Bram <BramT@enfocus.com>" on the list
some call had it other didn't, clean it up and add to all missing
ones
2013-12-09 15:23:40 +08:00
Jan Pokorný
9a85d40cef Fix incorrect spelling entites->entities
Partially, a follow-up of 81d7a8245c.

Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
2013-11-30 20:03:52 +08:00
Daniel Veillard
dcc1950319 Fix a parsing bug on non-ascii element and CR/LF usage
https://bugzilla.gnome.org/show_bug.cgi?id=698550

Somehow the behaviour of the internal parser routine changed
slightly when encountering CR/LF, which led to a bug when
parsing document with non-ascii Names
2013-05-22 22:56:45 +02:00
Daniel Veillard
63588f476f Fix a regression in xmlGetDocCompressMode()
The switch to xzlib had for consequence that the compression
level of the input was not gathered anymore in ctxt->input->buf,
then the parser compression flags was left to -1 and propagated
to the resulting document.
Fix the I/O layer to get compression detection in xzlib,
then carry it in the input buffer and the resulting document

  This should fix
    https://lsbbugs.linuxfoundation.org/show_bug.cgi?id=3456
2013-05-10 14:01:46 +08:00
Nikolay Sivov
d4a5d98139 Cast encoding name to char pointer to match arg type 2013-05-06 09:00:56 +08:00
Alexander Pastukhov
704d8c5e9a Fix an error in xmlCleanupParser
https://bugzilla.gnome.org/show_bug.cgi?id=698582

xmlCleanupParser calls xmlCleanupGlobals() and then
xmlResetLastError() but the later reallocate the global
data freed by previous call. Just swap the two calls.
2013-04-23 13:02:11 +08:00
Jüri Aedla
9ca816b3a6 Fix a couple of return without value
Error introduced in previous commit !
2013-04-16 22:00:13 +08:00
Daniel Veillard
e50ba8164e Improve handling of xmlStopParser()
Add a specific parser error
Try to stop parsing as quickly as possible
2013-04-11 15:54:51 +08:00
Daniel Veillard
cff2546f13 Cache presence of '<' in entities content
slightly modify how ent->checked is used, and use the lowest bit to
keep the information
2013-03-11 15:59:22 +08:00
Daniel Veillard
a3f1e3e571 Avoid extra processing on entities
If an entity has already been checked for correctness no
need to check it on every reference
2013-03-11 15:59:21 +08:00
Daniel Veillard
23f05e0c33 Detect excessive entities expansion upon replacement
If entities expansion in the XML parser is asked for,
it is possble to craft relatively small input document leading
to excessive on-the-fly content generation.
This patch accounts for those replacement and stop parsing
after a given threshold. it can be bypassed as usual with the
HUGE parser option.
2013-02-19 10:21:49 +08:00
Daniel Veillard
bf058dce13 Fix the flushing out of raw buffers on encoding conversions
https://bugzilla.gnome.org/show_bug.cgi?id=692915

the new set of converting functions tried to limit the encoding
conversion of the raw buffer to the consumption one to work in
a more progressive fashion. Unfortunately this was bad for
performances and led to errors on progressive parsing when
a very large chunk was close to the end of the document. Fix
the new internal function and switch back to the old way of
converting. Fix another bug in the process.
2013-02-13 18:19:42 +08:00
Daniel Veillard
de0cc20c29 Fix some buffer conversion issues
https://bugzilla.gnome.org/show_bug.cgi?id=690202

Buffer overflow errors originating from xmlBufGetInputBase in 2.9.0
The pointers from the context input were not properly reset after
that call which can do reallocations.
2013-02-12 16:55:34 +08:00
Patrick Gansterer
9c8eaabe83 Fix compiler warning after 153cf15905
Add missing cast for xmlNop to silence a compiler warning.
2013-01-04 19:48:55 +08:00
Dan Winship
cf8f0424db Fix an error in the progressive DTD parsing code
For https://bugzilla.gnome.org/show_bug.cgi?id=689958
We were looking for the wrong character in the input stream
2012-12-21 11:13:31 +08:00
Michael Wood
fb27e2cd20 Fix spelling of "length". 2012-10-30 10:18:49 +08:00
Daniel Veillard
6a36fbe3b3 Fix potential out of bound access 2012-10-29 10:39:55 +08:00
Daniel Veillard
153cf15905 Fix large parse of file from memory
https://bugzilla.redhat.com/show_bug.cgi?id=862969
The new code trying to detect excessive input lookup would
just get wrong sometimes in the case of very large file parsed
directly from memory.
2012-10-26 13:50:47 +08:00
Daniel Veillard
711b15d545 Fix a bug in the nsclean option of the parser
Raised as a side effect of:
https://bugzilla.gnome.org/show_bug.cgi?id=663844
2012-10-25 19:23:26 +08:00
Daniel Veillard
6c91aa384f Fix a regression in 2.9.0 breaking validation while streaming
https://bugzilla.gnome.org/show_bug.cgi?id=684774
with help from Kjell Ahlstedt <kjell.ahlstedt@bredband.net>
2012-10-25 15:33:59 +08:00
Jan Pokorný
81d7a8245c Fix typos in parser comments
Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
2012-09-13 22:40:28 +08:00
Daniel Veillard
f8e3db0445 Big space and tab cleanup
Remove all space before tabs and space and tabs at end of lines.
2012-09-11 13:26:36 +08:00
Daniel Veillard
28f5e1a2d6 Fix potential crash on entities errors
Related to https://bugs.launchpad.net/lxml/+bug/502959

Basically the core of the issue is that if an entity references another
entity, then in case we are replacing entities content, we should always
do so by copying the referenced content as long as the reference is
done within the entity. Otherwise, if for some reason there is a later
parsing error that entity content may be freed.

Complex scenario exposed by command:
thinkpad:~/XML/diveintopython-5.4/xml -> valgrind --db-attach=yes
../../xmllint --loaddtd --noout --noent diveintopython.xml

  Document references &a;
  a references &b;
  we references b content directly in by linking in the a content
  a has an error further down
  we free a, freeing the chunk from b
  Document references &b; after &a;
  we try to copy b content, but it was freed already => segfault

* parser.c: never reference directly entity content without copying if
  we aren't in the document main entity
2012-09-04 11:18:39 +08:00
Daniel Veillard
1f972e9f28 Cleanup some of the parser code
Prefetching assumptions about the amount of data read in GROW
should be backed up with test for 0 termination when at the
end of the buffer.
2012-08-15 10:16:37 +08:00
Daniel Veillard
968a03a2e5 Add support for big line numbers in error reporting
Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com>

* parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser
  option not switch on by default, it's an opt-in
* SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers
  in the psvi field of text nodes
* tree.c: expand xmlGetLineNo to extract those informations, also
  make sure we can't fail on recursive behaviour
* error.c: in __xmlRaiseError, if a node is provided, call
  xmlGetLineNo() if we can't get a valid line number.
* xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint
2012-08-13 12:41:33 +08:00
Daniel Veillard
5353bbf7dd More fixups on the push parser behaviour 2012-08-03 12:03:31 +08:00
Daniel Veillard
2b52aa0050 Strengthen behaviour of the push parser in problematic situations
Implement the maximum lookahead stategy, and fix some handling
of DTD to speed up processing.
2012-07-31 10:53:47 +08:00
Daniel Veillard
e7bf892d8c Improve error reporting on parser errors
The extra string was being dismissed when provided.
* parser.c: handle bot case properly
* result/: this changes a few error reports
2012-07-30 20:09:25 +08:00
Daniel Veillard
48b4cdde34 Enforce XML_PARSER_EOF state handling through the parser
That condition is one raised when the parser should positively stop
processing further even to report errors. Best is to test is after
most GROW call especially within loops
2012-07-30 16:16:04 +08:00
Daniel Veillard
0df83cae70 Fixup limits parser 2012-07-30 15:41:10 +08:00
Daniel Veillard
52d8ade7a7 Introduce some default parser limits
Those can be overrided by the XML_PARSE_HUGE option, they
are just default limits for Name lenght, dictionary size limits
and maximum amount of parser lookup.
* include/libxml/parserInternals.h: define the limits
* include/libxml/xmlerror.h: add a new error
* parser.c parserInternals.c: implements the new limits
2012-07-30 10:08:45 +08:00
Daniel Veillard
f572a78d58 More avoid quadratic behaviour 2012-07-23 14:24:28 +08:00
Daniel Veillard
5130481646 Impose a reasonable limit on PI size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same than for a
text node within content.
Also cleanup some unsigned int used for memory size.
2012-07-23 14:24:28 +08:00
Daniel Veillard
6568645164 Avoid quadratic behaviour in some push parsing cases
avoid rescanning over and over a very long input, just check
the incoming chunks
2012-07-23 14:24:28 +08:00
Daniel Veillard
58f73aca1a Impose a reasonable limit on comment size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same than for a
text node within content.
Also cleanup some unsigned int used for memory size.
2012-07-23 14:24:28 +08:00
Daniel Veillard
e17db9946c Impose a reasonable limit on attribute size
Unless the XML_PARSE_HUGE option is given to the parser,
the value is XML_MAX_TEXT_LENGTH, i.e. the same than for a
text node within content.
2012-07-23 14:24:27 +08:00
Daniel Veillard
00ac0d3b96 More cleanups for input/buffers code
When calling xmlParserInputBufferPush, the buffer may be reallocated
and at the input level the pointers for base, cur and end need to
be reevaluated.
* buf.c buf.h: add two new functions, one to get the base from the
  input of the buffer, and another one to reset the pointers based
  on the cur and base inded
* HTMLparser.c parser.c: cleanup to use the new helper functions
  as well as making sure size_t is used for the indexes computations
2012-07-23 14:24:27 +08:00
Daniel Veillard
61551a1eb7 Cleanup function xmlBufResetInput() to set input from Buffer
This was scattered in a number of modules, xmlParserInputPtr
have usually their base, cur and end pointer set from an
xmlBuf used as input.
* buf.c buf.h: add a new function implementing this setup
* parser.c HTMLparser.c catalog.c parserInternals.c xmlreader.c
  use the new function instead of digging into the buffer in
  all those modules
2012-07-23 14:24:27 +08:00
Daniel Veillard
768eb3b82d Convert XML parser to the new input buffers
The main changes are when the internal of the buffers structure
were adressed directly, we now use routines coming from buf.h
The routine xmlParserInputRead() which wasn't used anywhere is
deprecated too.
2012-07-23 14:24:26 +08:00