optimization seems unlikely to ever be useful, but if it occurs in
real code it should be added to GCC. Expanding strcmp of small strings
does appear useful (benchmarking shows it is 2-3x faster), so this would
be useful to implement in GCC (PR 78809).
* string/bits/string2.h (strcmp): Remove define.
(__strcmp_cg): Likewise.
(strncmp): Likewise.
This is transformed into rawmemchr by the bits/string2.h header.
However this is generally slower than strlen on most targets, even when
an optimized rawmemchr implementation exists. Since GCC7 optimizes
strchr (s, '\0') to strlen (s) + s, the GLIBC headers should not
transform this to rawmemchr. As GCC recognizes strchr as a builtin,
defining strchr as the builtin is not useful.
* string/bits/string2.h (strchr): Remove define.
Currently strsep calls strpbrk is is now a veneer to strcspn. Calling
strcspn directly is faster. Since it handles a delimiter string of size
1 as a special case, this is not needed in strsep itself. Although this
means there is a slightly higher overhead if the delimiter size is 1,
all other cases are slightly faster. The overall performance gain is 5-10%
on AArch64.
The string/bits/string2.h header contains optimizations for constant
delimiters of size 1-3. Benchmarking these showed similar performance for
size 1 (since in all cases strchr/strchrnul is used), while size 2 and 3
can give up to 2x speedup for small input strings. However if these cases
are common it seems much better to add this optimization to strcspn.
So move these header optimizations to string-inlines.c.
Improve the strsep benchmark so that it actually benchmarks something.
The current version contains a delimiter character at every position in the
input string, so there is very little work to do, and the extremely inefficent
simple_strsep implementation appears fastest in every case. The new version
has either no match in the input for the fail case and a match halfway in the
input for the success case. The input is then restored so that each iteration
does exactly the same amount of work. Reduce the number of testcases since
simple_strsep takes a lot of time now.
* benchtests/bench-strsep.c (oldstrsep): Add old implementation.
(do_one_test) Restore original string so iteration works.
* string/string-inlines.c (do_test): Create better input strings.
(test_main) Reduce number of testruns.
* string/string-inlines.c (__old_strsep_1c): New function.
(__old_strsep_2c): Likewise.
(__old_strsep_3c): Likewise.
* string/strsep.c (__strsep): Remove case of small delim string.
Call strcspn directly rather than strpbrk.
* string/bits/string2.h (__strsep): Remove define.
(__strsep_1c): Remove.
(__strsep_2c): Remove.
(__strsep_3c): Remove.
(strsep): Remove.
* sysdeps/unix/sysv/linux/internal_statvfs.c
(__statvfs_getflags): Rename to __strsep.
calls strcspn, call strcspn directly so we get the end of the token without
an extra call to rawmemchr. Also avoid an unnecessary call to strcspn after
the last token by adding an early exit for an empty string. Change strtok
to tailcall strtok_r to avoid unnecessary code duplication.
Remove the special header optimization for strtok_r of a 1-character
constant string - both strspn and strcspn contain optimizations for this
case. Benchmarking this showed similar performance in the worst case,
but up to 5.5x better performance in the "found" case for large inputs.
* benchtests/bench-strtok.c (oldstrtok): Add old implementation.
* string/strtok.c (strtok): Change to tailcall __strtok_r.
* string/strtok_r.c (__strtok_r): Optimize for performance.
* string/string-inlines.c (__old_strtok_r_1c): New function.
* string/bits/string2.h (__strtok_r): Move to string-inlines.c.
The comment above the bzero() macro in this file appears to have been
copied verbatim from the comment above the memset() prototype in
string.h proper. bzero() has no 'c' argument and can only set memory
contents to 0. (The comment above the prototype of bzero() in
string.h proper does not make the same mistake.)
* string/bits/string2.h: Fix typo in comment.
With now a faster strcspn implementation, it is faster to just use
it with some return tests than reimplementing strpbrk itself.
As for strcspn optimization, it is generally at least 10 times faster
than the existing implementation on bench-strspn on a few AArch64
implementations.
Also the string/bits/string2.h inlines make no longer sense, as current
implementation will already implement most of the optimizations.
Tested on x86_64, i386, and aarch64.
* string/strpbrk.c (strpbrk): Rewrite function.
* string/bits/string2.h (strpbrk): Use __builtin_strpbrk.
(__strpbrk_c2): Likewise.
(__strpbrk_c3): Likewise.
* string/string-inlines.c
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strpbrk_c2):
Likewise.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strpbrk_c3):
Likewise.
As for strcspn, this patch improves strspn performance using a much
faster algorithm. It first constructs a 256-entry table based on
the accept string and then uses it as a lookup table for the
input string. As for strcspn optimization, it is generally at least
10 times faster than the existing implementation on bench-strspn
on a few AArch64 implementations.
Also the string/bits/string2.h inlines make no longer sense, as current
implementation will already implement most of the optimizations.
Tested on x86_64, i686, and aarch64.
* string/strspn.c (strcspn): Rewrite function.
* string/bits/string2.h (strspn): Use __builtin_strcspn.
(__strspn_c1): Remove inline function.
(__strspn_c2): Likewise.
(__strspn_c3): Likewise.
* string/string-inlines.c
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strspn_c1): Add
compatibility symbol.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strspn_c2):
Likewise.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strspn_c3):
Likewise.
Improve strcspn performance using a much faster algorithm. It is kept simple
so it works well on most targets. It is generally at least 10 times faster
than the existing implementation on bench-strcspn on a few AArch64
implementations, and for some tests 100 times as fast (repeatedly calling
strchr on a small string is extremely slow...).
In fact the string/bits/string2.h inlines make no longer sense, as GCC
already uses strlen if reject is an empty string, strchrnul is 5 times as
fast as __strcspn_c1, while __strcspn_c2 and __strcspn_c3 are slower than
the strcspn main loop for large strings (though reject length 2-4 could be
special cased in the future to gain even more performance).
Tested on x86_64, i686, and aarch64.
* string/Version (libc): Add GLIBC_2.24.
* string/strcspn.c (strcspn): Rewrite function.
* string/bits/string2.h (strcspn): Use __builtin_strcspn.
(__strcspn_c1): Remove inline function.
(__strcspn_c2): Likewise.
(__strcspn_c3): Likewise.
* string/string-inline.c
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c1): Add
compatibility symbol.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c2):
Likewise.
[SHLIB_COMPAT(libc, GLIBC_2_1_1, GLIBC_2_24)] (__strcspn_c3):
Likewise.
* sysdeps/i386/string-inlines.c: Include generic string-inlines.c.
As discussed in
https://sourceware.org/ml/libc-alpha/2015-10/msg00403.html
the setting of _STRING_ARCH_unaligned currently controls the external
GLIBC ABI as well as selecting the use of unaligned accesses withing
GLIBC.
Since _STRING_ARCH_unaligned was recently changed for AArch64, this
would potentially break the ABI in GLIBC 2.23, so split the uses and add
_STRING_INLINE_unaligned to select the string ABI. This setting must be
fixed for each target, while _STRING_ARCH_unaligned may be changed from
release to release. _STRING_ARCH_unaligned is used unconditionally in
glibc. But <bits/string.h>, which defines _STRING_ARCH_unaligned, isn't
included with -Os. Since _STRING_ARCH_unaligned is internal to glibc and
may change between glibc releases, it should be made private to glibc.
_STRING_ARCH_unaligned should defined in the new string_private.h heade
file which is included unconditionally from internal <string.h> for glibc
build.
[BZ #19462]
* bits/string.h (_STRING_ARCH_unaligned): Renamed to ...
(_STRING_INLINE_unaligned): This.
* include/string.h: Include <string_private.h>.
* string/bits/string2.h: Replace _STRING_ARCH_unaligned with
_STRING_INLINE_unaligned.
* sysdeps/aarch64/bits/string.h (_STRING_ARCH_unaligned): Removed.
(_STRING_INLINE_unaligned): New.
* sysdeps/aarch64/string_private.h: New file.
* sysdeps/generic/string_private.h: Likewise.
* sysdeps/m68k/m680x0/m68020/string_private.h: Likewise.
* sysdeps/s390/string_private.h: Likewise.
* sysdeps/x86/string_private.h: Likewise.
* sysdeps/m68k/m680x0/m68020/bits/string.h
(_STRING_ARCH_unaligned): Renamed to ...
(_STRING_INLINE_unaligned): This.
* sysdeps/s390/bits/string.h (_STRING_ARCH_unaligned): Renamed
to ...
(_STRING_INLINE_unaligned): This.
* sysdeps/sparc/bits/string.h (_STRING_ARCH_unaligned): Renamed
to ...
(_STRING_INLINE_unaligned): This.
* sysdeps/x86/bits/string.h (_STRING_ARCH_unaligned): Renamed
to ...
(_STRING_INLINE_unaligned): This.
This patch cleans up cases of __USE_MISC that are trivially redundant
after the recent substitution of __USE_MISC for __USE_BSD and
__USE_SVID: either in constructs such as "defined __USE_MISC ||
defined __USE_MISC", or else (in the bits/mman.h case) a conditional
on __USE_MISC nested inside another __USE_MISC conditional. (The
cleanups remaining after this patch are still quite large, but it
seems a reasonable piece to separate out.)
Tested x86_64.
* bits/mman.h [__USE_MISC]: Remove redundant conditionals.
* ctype/ctype.h [__USE_MISC]: Likewise.
* dirent/dirent.h [__USE_MISC]: Likewise.
* grp/grp.h [__USE_MISC]: Likewise.
* io/fcntl.h [__USE_MISC]: Likewise.
* io/sys/stat.h [__USE_MISC]: Likewise.
* libio/stdio.h [__USE_MISC]: Likewise.
* posix/unistd.h [__USE_MISC]: Likewise.
* pwd/pwd.h [__USE_MISC]: Likewise.
* stdlib.h [__USE_MISC]: Likewise.
* string/bits/string2.h [__USE_MISC]: Likewise.
* string/string.h [__USE_MISC]: Likewise.
* time/time.h [__USE_MISC]: Likewise.
[BZ #14083]
Fix warning when using strspn with -Wconversion:
$ gcc -Wconversion -O t.c
t.c: In function ‘main’:
t.c:8:7: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
* string/bits/string2.h (__strdup): Cast parameters to calloc to
avoid warning with -Wconversion.
(__strndup): Likewise.
Patch to 50% by Christian Iseli <christian.iseli@licr.org>.
2004-05-26 Jakub Jelinek <jakub@redhat.com>
* include/string.h (mempcpy, stpcpy): Add libc_hidden_builtin_proto.
* string/bits/string2.h (memset): Disable macro for GCC 3.0+.
(__mempcpy): Use __builtin_mempcpy for GCC 3.4+.
(strchr): For GCC 3.2+, only use __rawmemchr if second argument is
constant '\0' and first argument is not constant.
(__stpcpy): Use __builtin_stpcpy for GCC 3.4+.
(strncpy): Remove #ifdef _USE_STRING_ARCH_mempcpy variant.
For GCC 3.2+ use __builtin_strncpy.
(strncat): For GCC 3.2+ use __builtin_strncat.
(strcmp): For GCC 3.2+ use __builtin_strcmp if both arguments are
constant.
(strcspn, strspn, strpbrk): For GCC 3.2+, use builtin function
if both arguments are constant.
2004-05-26 Ulrich Drepper <drepper@redhat.com>
* nss/nss_files/files-hosts.c: Fix condition for looking up IPv4
mapped addresses in gethostbyaddr.
2002-05-27 Alexandre Oliva <aoliva@redhat.com>
* configure.in (DO_STATIC_NSS): Define if --disable-shared.
2002-05-26 Bruno Haible <bruno@clisp.org>
* iconvdata/iso-2022-jp.c (BODY for TO_LOOP): Avoid running off the
end of the ISO-8859-7 from idx table.
2002-05-27 Ulrich Drepper <drepper@redhat.com>
* manual/lang.texi: Fix FLT_EPSILON description [PR libc/3649].
2002-05-24 David S. Miller <davem@redhat.com>
* string/bits/string2.h (memset): Do not try to optimize when
not _STRING_ARCH_unaligned if GCC will do the right thing.
2002-01-23 Richard Henderson <rth@redhat.com>
* sysdeps/alpha/Makefile (pic-ccflag): New variable.
2002-01-28 Ulrich Drepper <drepper@redhat.com>
* string/strxfrm.c: Allocate one more byte for rulearr and clear
this element [PR libc/2855].
* string/strcoll.c: Handle zero-length arguments specially
[PR libc/2856].
2002-01-23 Jakub Jelinek <jakub@redhat.com>
* string/bits/string2.h (__mempcpy): For gcc 3.0+, don't use
__mempcpy_small but instead use __builtin_memcpy ( , , n) + n for
short lengths and constant src.
(strcpy): Don't optimize for gcc 3.0+.
(__stpcpy): For gcc 3.0+, don't use
__stpcpy_small but instead use __builtin_strcpy (, src) + strlen (src)
for short string literal src.
2002-01-23 Jeroen Dobbelaere <jeroen.dobbelaere@acunia.com>
* sysdeps/unix/sysv/linux/configure.in (libc_cv_gcc_unwind_find_fde):
Set for arm, too.
2001-01-22 Paul Eggert <eggert@twinsun.com>
* manual/llio.texi (Linked Channels, Cleaning Streams):
Make it clearer that a just-opened input stream might need cleaning.
2002-01-21 H.J. Lu <hjl@gnu.org>
* sysdeps/mips/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC):
Don't use label at end of compound statement.
2001-11-02 Jakub Jelinek <jakub@redhat.com>
* string/bits/string2.h (__strndup): If n is smaller than len, set
len to n + 1.
* string/tester.c (test_strndup): New function.
(main): Call it.
* sunrpc/rpc_main.c: Optimize variable definitions a bit.
2001-10-04 Ben Collins <bcollins@debian.org>
* sysdeps/generic/inttypes.h: Fix typo (define, not defined) in
decleration of __need_wchar_t.
2001-10-03 Jakub Jelinek <jakub@redhat.com>
* string/bits/string2.h (__strsep_g): Add prototype.
(__strsep): Use it.
* string/Versions (__strsep): Remove.
* sysdeps/generic/strsep.c (__strsep_g): Add alias to __strsep.
2001-10-07 Ulrich Drepper <drepper@redhat.com>
* manua/llio.texi: Clarify file references added by mmap.
Patch by Marcus Brinkmann <Marcus.Brinkmann@ruhr-uni-bochum.de>.
2001-07-06 Paul Eggert <eggert@twinsun.com>
* manual/argp.texi: Remove ignored LGPL copyright notice; it's
not appropriate for documentation anyway.
* manual/libc-texinfo.sh: "Library General Public License" ->
"Lesser General Public License".
2001-07-06 Andreas Jaeger <aj@suse.de>
* All files under GPL/LGPL version 2: Place under LGPL version
2.1.
* string/bits/string2.h (strspn): Evaluate first argument if
second is "".
(strpbrk): Likewise.
* sysdeps/i386/i486/bits/string.h: Likewise.
* string/Makefile (tests): Add bug-strspn1 and bug-strpbrk1.
* string/bug-strspn1.c: New file.
* string/bug-strpbrk1.c: New file.
Test cases by Joseph S. Myers <jsm28@cam.ac.uk>.
* string/bits/string2.h (strncat): Terminate string correctly.
* sysdeps/i386/i486/bits/string.h (strncat): Likewise.
* string/Makefile (tests): Add bug-strncat1.
* string/bug-strncat1.c: New file.
Test case by Joseph S. Myers <jsm28@cam.ac.uk>.
2000-10-27 Ben Collins <bcollins@debian.org>
* sysdeps/generic/lockf.c (lockf): Set l_type to F_RDLCK before
calling for F_GETLK.
2000-10-29 Ulrich Drepper <drepper@redhat.com>