mirror of
https://github.com/MariaDB/server.git
synced 2025-07-27 18:02:13 +03:00
8.38
@@ -329,7 +329,8 @@ A second use of backslash provides a way of encoding non-printing characters
 in patterns in a visible manner. There is no restriction on the appearance of
 non-printing characters, apart from the binary zero that terminates a pattern,
 but when a pattern is being prepared by text editing, it is often easier to use
-one of the following escape sequences than the binary character it represents:
+one of the following escape sequences than the binary character it represents.
+In an ASCII or Unicode environment, these escapes are as follows:
 <pre>
 \a alarm, that is, the BEL character (hex 07)
 \cx "control-x", where x is any ASCII character
@@ -353,19 +354,33 @@ data item (byte or 16-bit value) following \c has a value greater than 127, a
 compile-time error occurs. This locks out non-ASCII characters in all modes.
 </P>
 <P>
-The \c facility was designed for use with ASCII characters, but with the
-extension to Unicode it is even less useful than it once was. It is, however,
-recognized when PCRE is compiled in EBCDIC mode, where data items are always
-bytes. In this mode, all values are valid after \c. If the next character is a
-lower case letter, it is converted to upper case. Then the 0xc0 bits of the
-byte are inverted. Thus \cA becomes hex 01, as in ASCII (A is C1), but because
-the EBCDIC letters are disjoint, \cZ becomes hex 29 (Z is E9), and other
-characters also generate different values.
+When PCRE is compiled in EBCDIC mode, \a, \e, \f, \n, \r, and \t
+generate the appropriate EBCDIC code values. The \c escape is processed
+as specified for Perl in the <b>perlebcdic</b> document. The only characters
+that are allowed after \c are A-Z, a-z, or one of @, [, \, ], ^, _, or ?. Any
+other character provokes a compile-time error. The sequence \@ encodes
+character code 0; the letters (in either case) encode characters 1-26 (hex 01
+to hex 1A); [, \, ], ^, and _ encode characters 27-31 (hex 1B to hex 1F), and
+\? becomes either 255 (hex FF) or 95 (hex 5F).
+</P>
+<P>
+Thus, apart from \?, these escapes generate the same character code values as
+they do in an ASCII environment, though the meanings of the values mostly
+differ. For example, \G always generates code value 7, which is BEL in ASCII
+but DEL in EBCDIC.
+</P>
+<P>
+The sequence \? generates DEL (127, hex 7F) in an ASCII environment, but
+because 127 is not a control character in EBCDIC, Perl makes it generate the
+APC character. Unfortunately, there are several variants of EBCDIC. In most of
+them the APC character has the value 255 (hex FF), but in the one Perl calls
+POSIX-BC its value is 95 (hex 5F). If certain other characters have POSIX-BC
+values, PCRE makes \? generate 95; otherwise it generates 255.
 </P>
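The EBCDIC-mode rule added in this hunk can be sketched in Python as a rough illustration. This is not PCRE's implementation: the function name `ebcdic_control_value` and the `posix_bc` flag are hypothetical, and the sketch assumes the text's \@, \G and \? are meant as \c@, \cG and \c?.

```python
import string

# Characters the text allows after \c (other than "?"), in code-value order:
# "@" -> 0, "A".."Z" -> 1..26, "[", "\", "]", "^", "_" -> 27..31.
_CONTROL_ORDER = "@" + string.ascii_uppercase + "[\\]^_"


def ebcdic_control_value(ch: str, posix_bc: bool = False) -> int:
    """Return the code value for the character following \\c in EBCDIC mode.

    Hypothetical helper: `posix_bc` stands in for PCRE's detection of the
    POSIX-BC variant, which the text describes only loosely.
    """
    if len(ch) != 1:
        raise ValueError("exactly one character must follow \\c")
    if ch == "?":
        # APC is 95 (hex 5F) in POSIX-BC and 255 (hex FF) in other variants.
        return 0x5F if posix_bc else 0xFF
    upper = ch.upper()  # letters are accepted in either case
    if len(upper) == 1 and upper in _CONTROL_ORDER:
        return _CONTROL_ORDER.index(upper)
    # Any other character is a compile-time error in PCRE; modelled as ValueError.
    raise ValueError(f"character {ch!r} is not allowed after \\c")
```

Under these assumptions, `ebcdic_control_value("G")` gives 7, the value the text describes as BEL in ASCII but DEL in EBCDIC.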
 <P>
 After \0 up to two further octal digits are read. If there are fewer than two
-digits, just those that are present are used. Thus the sequence \0\x\07
-specifies two binary zeros followed by a BEL character (code value 7). Make
+digits, just those that are present are used. Thus the sequence \0\x\015
+specifies two binary zeros followed by a CR character (code value 13). Make
 sure you supply two digits after the initial zero if the pattern character that
 follows is itself an octal digit.
 </P>
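The \0 rule in the paragraph above can be modelled with a small Python sketch. The helper `parse_zero_escape` is hypothetical and not part of PCRE's API; it only illustrates reading up to two further octal digits after `\0`.

```python
OCTAL_DIGITS = "01234567"


def parse_zero_escape(pattern: str, start: int) -> tuple[int, int]:
    """Parse a \\0 escape beginning at pattern[start].

    Returns (code_value, index_of_next_unread_character).
    """
    if pattern[start:start + 2] != "\\0":
        raise ValueError("expected \\0 at the given position")
    pos = start + 2
    value = 0
    # Read at most two further octal digits; stop early at a non-octal character.
    while pos < len(pattern) and pos < start + 4 and pattern[pos] in OCTAL_DIGITS:
        value = value * 8 + int(pattern[pos])
        pos += 1
    return value, pos


# \015 is CR (13); a bare \0 followed by a non-digit is a binary zero.
print(parse_zero_escape(r"\015", 0))
print(parse_zero_escape(r"\0\x", 0))
```

In this model, `\071` is a single character (octal 71), whereas `\0071` is code value 7 followed by a literal `1`, which is why the text advises supplying both digits when an octal digit follows.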
@@ -3249,9 +3264,9 @@ Cambridge CB2 3QH, England.
 </P>
 <br><a name="SEC30" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 08 January 2014
+Last updated: 14 June 2015
 <br>
-Copyright © 1997-2014 University of Cambridge.
+Copyright © 1997-2015 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE index page</a>.