The current racy approach is to enable asynchronous cancellation
before making the syscall and restore the previous cancellation
type once the syscall returns, and check if cancellation has happened
during the cancellation entrypoint.
As described in BZ#12683, this approach shows 2 problems:
1. Cancellation can act after the syscall has returned from the
kernel, but before userspace saves the return value. It might
result in a resource leak if the syscall allocated a resource or a
side effect (partial read/write), and there is no way for the program
to handle it with cancellation handlers.
2. If a signal is handled while the thread is blocked at a cancellable
syscall, the entire signal handler runs with asynchronous
cancellation enabled. This can lead to issues if the signal
handler calls functions which are async-signal-safe but not
async-cancel-safe.
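The racy pattern can be sketched with the public pthread API alone. This is an illustration under assumptions, not the code glibc used: the helper name racy_cancellable_read is hypothetical, and the internal implementation relied on its own enable/disable helpers rather than pthread_setcanceltype.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper showing the old scheme: widen the cancellation
   window around the blocking call, then narrow it again.  */
static ssize_t
racy_cancellable_read (int fd, void *buf, size_t len)
{
  int oldtype;
  int ignored;

  assert (buf != NULL);

  /* From here on a cancellation request may act at any instruction,
     including inside a signal handler that interrupts the read.  */
  pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &oldtype);

  ssize_t ret = read (fd, buf, len);

  /* Race window: if cancellation acts after the kernel has returned
     but before this point, RET and any bytes already consumed from
     the descriptor are lost to the program.  */
  pthread_setcanceltype (oldtype, &ignored);
  return ret;
}
```

Both problems from BZ#12683 are visible here: the window between read returning and the type being restored, and the fact that any signal handler running during read inherits asynchronous cancellation.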
For the cancellation to work correctly, there are 5 points at which the
cancellation signal could arrive:
[ ... )[ ... )[ syscall ]( ...
1 2 3 4 5
1. Before initial testcancel, e.g. [*... testcancel)
2. Between testcancel and syscall start, e.g. [testcancel...syscall start)
3. While syscall is blocked and no side effects have yet taken
place, e.g. [ syscall ]
4. Same as 3 but with side-effects having occurred (e.g. a partial
read or write).
5. After syscall end e.g. (syscall end...*]
And libc wants to act on cancellation in cases 1, 2, and 3 but not
in cases 4 or 5. For the 4 and 5 cases, the cancellation will eventually
happen in the next cancellable entrypoint without any further external
event.
The proposed solution for each case is:
1. Do a conditional branch based on whether the thread has received
a cancellation request;
2. It can be caught by the signal handler determining that the saved
program counter (from the ucontext_t) is in some address range
beginning just before the "testcancel" and ending with the
syscall instruction.
3. SIGCANCEL can be caught by the signal handler, which determines that
the saved program counter (from the ucontext_t) is in the address
range beginning just before "testcancel" and ending with the first
uninterruptible (via a signal) syscall instruction that enters the
kernel.
4. In this case, except for certain syscalls that ALWAYS fail with
EINTR even for non-interrupting signals, the kernel will reset
the program counter to point at the syscall instruction during
signal handling, so that the syscall is restarted when the signal
handler returns. So, from the signal handler's standpoint, this
looks the same as case 2, and thus it's taken care of.
5. For syscalls with side-effects, the kernel cannot restart the
syscall; when it's interrupted by a signal, the kernel must cause
the syscall to return with whatever partial result is obtained
(e.g. partial read or write).
6. The saved program counter points just after the syscall
instruction, so the signal handler won't act on cancellation.
This is similar to 4. since the program counter is past the syscall
instruction.
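The program-counter test that the cases above rely on can be sketched as a half-open range check. The struct and function names here are hypothetical, and the marker values would come from labels around the syscall instruction in a real implementation; this only models the comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker pair.  START is just before the cancellation
   test; END is the address just after the syscall instruction.  */
struct cancel_markers
{
  uintptr_t start;
  uintptr_t end;
};

/* Return true if a cancellation signal that interrupted the thread at
   PC may act immediately.  A PC equal to END means the syscall has
   already returned, so the request is left for the next cancellable
   entrypoint.  */
static bool
pc_allows_cancel (const struct cancel_markers *m, uintptr_t pc)
{
  assert (m->start < m->end);
  return pc >= m->start && pc < m->end;
}
```

A restarted syscall has its saved program counter reset to the syscall instruction itself, which lies inside the range, so it is treated like an interruption before the syscall started.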
So the proposed fixes are:
1. Remove the enable_asynccancel/disable_asynccancel function usage in
cancellable syscall definition and instead make them call a common
symbol that will check if cancellation is enabled (__syscall_cancel
at nptl/cancellation.c), call the arch-specific cancellable
entry-point (__syscall_cancel_arch), and cancel the thread when
required.
2. Provide an arch-specific generic system call wrapper function
that contains global markers. These markers will be used in the
SIGCANCEL signal handler to check whether the interruption happened
in a valid syscall and whether the syscall has side-effects.
A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
is provided. However, the markers may not be placed at the
expected locations depending on how INTERNAL_SYSCALL_NCS is
implemented by the architecture. It is expected that all
architectures add an arch-specific implementation.
3. Rewrite the SIGCANCEL asynchronous handler to check both the
cancellation type and whether the current IP from the signal handler
falls between the global markers, and act accordingly.
4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET to
use the appropriate cancelable syscalls.
5. Adjust 'lowlevellock-futex.h' arch-specific implementations to
provide cancelable futex calls.
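The control flow of the common entry point from fix 1 can be modelled roughly as follows. Everything here is a simplified sketch under assumptions: struct thread_model, syscall_cancel_model and the fake syscalls are hypothetical names, the real code cancels the thread instead of setting a flag, and treating -EINTR as the "interrupted without side effects" result is an assumption of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread cancellation state.  */
struct thread_model
{
  bool cancel_enabled;
  bool cancel_requested;
  bool cancelled;        /* Stands in for actually unwinding the thread.  */
};

typedef long (*syscall_fn) (struct thread_model *self);

static long
syscall_cancel_model (struct thread_model *self, syscall_fn arch_entry)
{
  assert (self != NULL && arch_entry != NULL);

  /* Cancellation disabled: plain syscall, no extra checks.  */
  if (!self->cancel_enabled)
    return arch_entry (self);

  /* Request already pending before the syscall starts.  */
  if (self->cancel_requested)
    {
      self->cancelled = true;
      return -EINTR;   /* Placeholder; a real thread would not return.  */
    }

  long result = arch_entry (self);

  /* Interrupted by the cancellation signal with no side effects.  */
  if (result == -EINTR && self->cancel_requested)
    self->cancelled = true;
  return result;
}

/* Fake arch entry points used only to exercise the model.  */
static long
fake_syscall_ok (struct thread_model *self)
{
  (void) self;
  return 42;
}

static long
fake_syscall_interrupted (struct thread_model *self)
{
  self->cancel_requested = true;   /* Simulates SIGCANCEL arriving.  */
  return -EINTR;
}
```

A successful result is returned unchanged even when a request arrives late, which matches the goal of not acting on cancellation once side effects have happened.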
Some architectures require specific support on syscall handling:
* On i386 the syscall cancel bridge needs to use the old int80
instruction because, with the optimized vDSO symbol, the resulting PC
value for an interrupted syscall points to an address outside the
expected markers in __syscall_cancel_arch. It has been discussed in
LKML [1] how the kernel could help userland accomplish this, but as
far as I know the discussion has stalled.
Also, sysenter should not be used directly by libc since its calling
convention is set by the kernel depending on the underlying x86 chip
(check kernel commit 30bfa7b3488bfb1bb75c9f50a5fcac1832970c60).
* mips o32 is the only kABI that requires 7-argument syscalls, and to
avoid adding a requirement on all architectures to support them, mips
support is added with extra internal defines.
Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc-linux-gnu,
powerpc64-linux-gnu, powerpc64le-linux-gnu, i686-linux-gnu, and
x86_64-linux-gnu.
[1] https://lkml.org/lkml/2016/3/8/1105
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This allows us to define a generic no-op version of PTR_MANGLE and
PTR_DEMANGLE. In the future, we can use PTR_MANGLE and PTR_DEMANGLE
unconditionally in C sources, avoiding an unintended loss of hardening
due to missing include files or unlucky header inclusion ordering.
In i386 and x86_64, we can avoid a <tls.h> dependency in the C
code by using the computed constant from <tcb-offsets.h>. <sysdep.h>
no longer includes these definitions, so there is no cyclic dependency
anymore when computing the <tcb-offsets.h> constants.
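The difference between a generic no-op definition and a hardened one can be sketched as below. The macro names with _NOOP and _XOR suffixes, the fixed guard value, and the plain XOR scheme are all assumptions made for illustration; real ports use a per-process secret and may apply further transformations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed guard, for illustration only; a real guard would
   be a per-process secret.  */
static const uintptr_t guard_sketch = (uintptr_t) 0x9e3779b9u;

/* Generic no-op form: evaluates the operand and leaves it unchanged,
   so C sources can use the macros unconditionally.  */
#define PTR_MANGLE_NOOP(var)   ((void) (var))
#define PTR_DEMANGLE_NOOP(var) ((void) (var))

/* Simplified hardened form: XOR with the guard, in place.  */
#define PTR_MANGLE_XOR(var)   ((var) ^= guard_sketch)
#define PTR_DEMANGLE_XOR(var) ((var) ^= guard_sketch)

/* Mangle and demangle a pointer-sized value; returns the demangled
   result so callers can check the round trip.  */
static uintptr_t
mangle_roundtrip (uintptr_t p)
{
  uintptr_t stored = p;
  PTR_MANGLE_XOR (stored);
  assert (stored != p);      /* The stored form differs from the raw one.  */
  PTR_DEMANGLE_XOR (stored);
  return stored;
}
```

With an always-available fallback, forgetting an include can no longer silently turn a mangled store into an unmangled one on a port that supports hardening.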
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.
I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah. I don't
know why I run into these diagnostics whereas others evidently do not.
remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
I used these shell commands:
../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")
and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
It removes all the arch-specific assembly implementations. The
outliers are alpha, where the kernel ABI explicitly returns -ENOMEM
in case of failure; and i686, where it can't use
"call *%gs:SYSINFO_OFFSET" during startup in static PIE.
Also, some ABIs export an additional ___brk_addr symbol, and to
handle it an internal HAVE_INTERNAL_BRK_ADDR_SYMBOL is added.
Checked on x86_64-linux-gnu, i686-linux-gnu, and with builds for
the affected ABIs.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
With all Linux ABIs using the expected Linux kABI to indicate
syscall errors, the INTERNAL_SYSCALL_DECL is an empty declaration
on all ports.
This patch removes the 'err' argument from the INTERNAL_SYSCALL* macros
and removes the INTERNAL_SYSCALL_DECL usage.
Checked with a build against all affected ABIs.
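Once the error indication lives in the return value itself, no separate 'err' variable is needed to classify a result. A minimal sketch, assuming the usual Linux convention that values in the range [-4095, -1] are negated errno codes; the helper names syscall_error_p and syscall_errno are hypothetical stand-ins, not the glibc macros.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* True if VAL, a raw syscall return value, encodes an error.  The
   unsigned comparison selects exactly the values -4095 .. -1.  */
static bool
syscall_error_p (long val)
{
  return (unsigned long) val > -4096UL;
}

/* Extract the errno code from a raw value known to be an error.  */
static int
syscall_errno (long val)
{
  assert (syscall_error_p (val));
  return (int) -val;
}
```

Both helpers take only the result, which is why a declaration macro for an extra error variable has nothing left to declare.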
With all Linux ABIs using the expected Linux kABI to indicate
syscall errors, there is no need to replicate the INLINE_SYSCALL.
The generic Linux sysdep.h includes errno.h even for !__ASSEMBLER__,
which is ok now and allows cleaning up some archaic code that assumes
otherwise.
Checked with a build against all affected ABIs.
It changes the mips INTERNAL_SYSCALL* and internal_syscall* macros
to return a negative value instead of the 'a3' register value in the
'err' macro argument.
The macro INTERNAL_SYSCALL_DECL is no longer required, and the
INTERNAL_SYSCALL_ERROR_P macro follows the other Linux kABIs.
The redefinition of INTERNAL_VSYSCALL_CALL is also no longer
required.
Checked on mips64-linux-gnu, mips64n32-linux-gnu, and mips-linux-gnu.
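The conversion can be sketched in plain C, assuming the MIPS kernel convention that 'a3' is nonzero on failure and 'v0' then holds a positive errno code. The function name mips_fold_error is hypothetical; in the real macros the equivalent expression sits next to the inline asm.

```c
#include <assert.h>

/* Fold the MIPS two-register result (value in v0, error flag in a3)
   into the single negative-on-error value used by other Linux ports.  */
static long
mips_fold_error (long v0, long a3)
{
  /* On failure the kernel leaves a positive errno code in v0.  */
  assert (a3 == 0 || v0 > 0);
  return a3 != 0 ? -v0 : v0;
}
```

After this folding, callers can test the result with the same range check as every other architecture.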
According to [gcc documentation][1], temporary variables must be used for
the desired content to not be call-clobbered.
Fix the Linux inline syscall templates by adding temporary variables,
much like what x86 did before
(commit 381a0c26d73e0f074c962e0ab53b99a6c327066d).
Tested with gcc 9.2.0, both cross-compiled and natively on Loongson
3A4000.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html
GCC 10 (PR 91233) won't silently allow registers that are not architecturally
available to be present in the clobber list anymore, resulting in build failure
for mips*r6 targets in form of:
...
.../sysdep.h:146:2: error: the register ‘lo’ cannot be clobbered in ‘asm’ for the current target
146 | __asm__ volatile ( \
| ^~~~~~~
This is because base R6 ISA doesn't define hi and lo registers w/o DSP extension.
This patch provides the alternative definitions of __SYSCALL_CLOBBERS for r6
targets that won't include those registers.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h (__SYSCALL_CLOBBERS): Exclude
hi and lo from the clobber list for __mips_isa_rev >= 6.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h (__SYSCALL_CLOBBERS): Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h (__SYSCALL_CLOBBERS): Likewise.
This patch consolidates the mips, mips64, and mips64-n32
INTERNAL_VSYSCALL_CALL on a single implementation.
No semantic changes. I checked against a build for mips-linux-gnu,
mips64-linux-gnu, and mips64-n32-linux-gnu.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
(INTERNAL_VSYSCALL_CALL): Remove.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
(INTERNAL_VSYSCALL_CALL): Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h
(INTERNAL_VSYSCALL_CALL): Likewise.
* sysdeps/unix/sysv/linux/mips/sysdep.h (INTERNAL_VSYSCALL_CALL):
New macro.
This patch assumes static vDSO is supported by default; it is now supported
on all current architectures that support vDSO. It allows removing both the
ALWAYS_USE_VSYSCALL define, which an architecture was required to define
explicitly, and USE_VSYSCALL (which enabled vDSO only for shared builds or if
the architecture defined ALWAYS_USE_VSYSCALL).
Checked with a build against all affected ABIs.
[BZ #19767]
* sysdeps/unix/sysv/linux/aarch64/sysdep.h (ALWAYS_USE_VSYSCALL):
Remove definition.
* sysdeps/unix/sysv/linux/arm/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/i386/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/riscv/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/sysdep.h
(ALWAYS_USE_VSYSCALL): Likewise.
* sysdeps/unix/sysv/linux/sparc/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/x86_64/sysdep.h (ALWAYS_USE_VSYSCALL):
Likewise.
* sysdeps/unix/sysv/linux/x86/libc-vdso.h: Remove #if USE_VSYSCALL.
* sysdeps/unix/sysv/linux/sysdep-vdso.h: Likewise.
* sysdeps/unix/sysv/linux/sysdep.h (ALWAYS_USE_VSYSCALL,
USE_VSYSCALL): Remove definitions.
I have tested that this builds and the resulting programs still work.
This was tested on gcc23.fsffrance.org, and for some reason the vdso
there seems unused even when using shared libraries.
[BZ #19767]
* sysdeps/unix/sysv/linux/mips/init-first.c: Remove #ifdef SHARED.
* sysdeps/unix/sysv/linux/mips/libc-vdso.h: Remove #ifdef SHARED.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h: Define
ALWAYS_USE_VSYSCALL.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h: Define
ALWAYS_USE_VSYSCALL.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h: Define
ALWAYS_USE_VSYSCALL.
Fix a commit cc25c8b4c119 ("New pthread rwlock that is more scalable.")
regression and prevent uncontrolled stack space usage from happening
when a 5-, 6- or 7-argument syscall wrapper is placed in a loop.
The cause of the problem is the use of `alloca' in regular MIPS/Linux
wrappers to force the use of the frame pointer register in any function
using one or more of these wrappers. Using the frame pointer register
is required so as not to break frame unwinding as the stack pointer
is lowered within the inline asm used by these wrappers to make room for
the stack arguments, which 5-, 6- and 7-argument syscalls use with the
o32 ABI.
The regular MIPS/Linux wrappers are macros however, expanded inline, and
stack allocations made with `alloca' are not discarded until the return
of the function they are made in. Consequently if called in a loop,
then virtual memory is wasted, and if the loop goes through enough
iterations, then ultimately available memory can get exhausted causing
the program to crash.
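The accumulation can be observed with a small portable sketch. The function name stack_growth_in_loop is hypothetical and the 4-byte allocation merely mimics a frame-pointer-forcing alloca expanded inline; the measured distance depends on the compiler and ABI, so no specific figure is claimed.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Call alloca once per iteration, as an inline-expanded wrapper would,
   and return the distance between the first and the last block.  The
   blocks are only released when this function returns, so the distance
   grows with the iteration count.  */
static size_t
stack_growth_in_loop (size_t iterations)
{
  uintptr_t first = 0;
  uintptr_t last = 0;

  assert (iterations > 0);

  for (size_t i = 0; i < iterations; i++)
    {
      /* The volatile pointer keeps the allocation from being elided.  */
      void *volatile p = alloca (4);
      uintptr_t addr = (uintptr_t) p;
      if (i == 0)
        first = addr;
      last = addr;
    }

  return (size_t) (first > last ? first - last : last - first);
}
```

Moving the allocation into a separate function that returns before the next iteration is what bounds the usage, which is the effect of calling a standalone routine instead of expanding the code inline.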
Address the issue by replacing the inline code with standalone assembly
functions, which rely on the compiler arranging syscall arguments
according to the o32 function calling convention, which MIPS/Linux
syscalls also use, except for the syscall number passed and the error
flag returned. This way there is no need to fiddle with the stack
pointer anymore and all that has to be handled in the new standalone
functions is the special handling of the syscall number and the error
flag.
Redirect 5-, 6- or 7-argument MIPS16/Linux syscall wrappers to these new
functions as well, so as to avoid an unnecessary double call the
existing wrappers would cause with the new arrangement.
[BZ #21956]
* sysdeps/unix/sysv/linux/mips/mips32/mips16/Makefile
[subdir = misc] (sysdep_routines): Remove `mips16-syscall5',
`mips16-syscall6' and `mips16-syscall7'.
(CFLAGS-mips16-syscall5.c, CFLAGS-mips16-syscall6.c)
(CFLAGS-mips16-syscall7.c): Remove.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/Versions (libc):
Remove `__mips16_syscall5', `__mips16_syscall6' and
`__mips16_syscall7'.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall0.c
(__mips16_syscall0): Rename `__mips16_syscall_return' to
`__mips_syscall_return'.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall1.c
(__mips16_syscall1): Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall2.c
(__mips16_syscall2): Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall3.c
(__mips16_syscall3): Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall4.c
(__mips16_syscall4): Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall5.c:
Remove.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall6.c:
Remove.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall7.c:
Remove.
* sysdeps/unix/sysv/linux/mips/mips32/mips16/mips16-syscall.h
(__mips16_syscall5): Expand to `__mips_syscall5' rather than
`__mips16_syscall5'. Remove prototype.
(__mips16_syscall6): Expand to `__mips_syscall6' rather than
`__mips16_syscall6'. Remove prototype.
(__mips16_syscall7): Expand to `__mips_syscall7' rather than
`__mips16_syscall7'. Remove prototype.
(__nomips16, __mips16_syscall_return): Move to...
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h
(__nomips16, __mips_syscall_return): ... here.
[__mips16] (INTERNAL_SYSCALL_NCS): Rename
`__mips16_syscall_return' to `__mips_syscall_return'.
[__mips16] (INTERNAL_SYSCALL_MIPS16): Pass `number' to
`internal_syscall##nr'.
[!__mips16] (INTERNAL_SYSCALL): Pass `SYS_ify (name)' to
`internal_syscall##nr'.
(FORCE_FRAME_POINTER): Remove.
(__mips_syscall5): New prototype.
(internal_syscall5): Rewrite to call `__mips_syscall5'.
(__mips_syscall6): New prototype.
(internal_syscall6): Rewrite to call `__mips_syscall6'.
(__mips_syscall7): New prototype.
(internal_syscall7): Rewrite to call `__mips_syscall7'.
* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall5.S: New file.
* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall6.S: New file.
* sysdeps/unix/sysv/linux/mips/mips32/mips-syscall7.S: New file.
* sysdeps/unix/sysv/linux/mips/mips32/Makefile [subdir = misc]
(sysdep_routines): Add libc-do-syscall.
* sysdeps/unix/sysv/linux/mips/mips32/Versions (libc): Add
`__mips_syscall5', `__mips_syscall6' and `__mips_syscall7'.
This patch adds support for using the implementations of gettimeofday()
and clock_gettime() provided by the kernel in the VDSO. The VDSO will
always provide clock_gettime() as CLOCK_{REALTIME,MONOTONIC}_COARSE can
be implemented regardless of platform. CLOCK_{REALTIME,MONOTONIC}, along
with gettimeofday(), are only implemented on platforms which make use of
either the CP0 count or GIC as their clocksource. On other platforms,
the VDSO does not provide the __vdso_gettimeofday symbol, as it is
never useful.
The VDSO functions return ENOSYS when they encounter an unsupported
request, in which case glibc should fall back to the standard syscall.
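The fallback logic can be sketched with function pointers standing in for the vDSO symbol and the real syscall. All names below are hypothetical, the signature is simplified, and modelling the unsupported case as a -ENOSYS return value is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified clock query: returns 0 on success or a negated errno.  */
typedef long (*time_fn) (int clock_id, long *out_ns);

/* Try the vDSO entry first; fall back to the syscall if there is no
   vDSO symbol or it reports the request as unsupported.  */
static long
gettime_with_fallback (time_fn vdso_fn, time_fn syscall_fn,
                       int clock_id, long *out_ns)
{
  assert (syscall_fn != NULL && out_ns != NULL);

  if (vdso_fn != NULL)
    {
      long r = vdso_fn (clock_id, out_ns);
      if (r != -ENOSYS)
        return r;      /* Success, or an error the syscall would repeat.  */
    }
  return syscall_fn (clock_id, out_ns);
}

/* Fake providers used only to exercise the sketch.  */
static long
fake_vdso_unsupported (int clock_id, long *out_ns)
{
  (void) clock_id;
  (void) out_ns;
  return -ENOSYS;
}

static long
fake_vdso_ok (int clock_id, long *out_ns)
{
  (void) clock_id;
  *out_ns = 1;
  return 0;
}

static long
fake_syscall (int clock_id, long *out_ns)
{
  (void) clock_id;
  *out_ns = 2;
  return 0;
}
```

The NULL check covers platforms where a symbol such as __vdso_gettimeofday is not provided at all.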
Tested with upstream kernel 4.5 and QEMU emulating Malta.
./vdsotest gettimeofday bench
gettimeofday: syscall: 1021 nsec/call
gettimeofday: libc: 262 nsec/call
gettimeofday: vdso: 174 nsec/call
* sysdeps/unix/sysv/linux/mips/Makefile (sysdep_routines):
Include dl-vdso.
* sysdeps/unix/sysv/linux/mips/Versions: Add
__vdso_clock_gettime.
* sysdeps/unix/sysv/linux/mips/init-first.c: New file.
* sysdeps/unix/sysv/linux/mips/libc-vdso.h: New file.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h:
(INTERNAL_VSYSCALL_CALL): Define to be compatible with MIPS
definitions of INTERNAL_SYSCALL_{ERROR_P,ERRNO}.
(HAVE_CLOCK_GETTIME_VSYSCALL): Define.
(HAVE_GETTIMEOFDAY_VSYSCALL): Define.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h: Likewise.
Carlos noted in
<https://sourceware.org/ml/libc-alpha/2015-05/msg00680.html> that
various ports use potentially problematic short variable names in
their syscall macros, which could shadow variables with the same name
from containing scopes.
This patch fixes variables called err and ret in MIPS macros. (I left
result_var and _sys_result - separate variables in different macros,
which need separate names - alone.)
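The shadowing hazard can be shown with two toy macros written as GNU C statement expressions. The macro names are hypothetical and the bodies are far simpler than the real syscall macros; only the naming issue is modelled.

```c
#include <assert.h>

/* Problematic shape (defined for comparison, never expanded here): if
   the caller's argument is a variable named `err', the inner
   declaration shadows it and the initializer reads the new,
   uninitialized variable instead of the caller's.  */
#define NEGATE_SHADOWING(x) __extension__ ({ long err = (x); -err; })

/* Safer shape: a prefixed name that callers are unlikely to use.  */
#define NEGATE_PREFIXED(x) __extension__ ({ long _sc_err = (x); -_sc_err; })

/* A caller whose own variable happens to be called `err'.  */
static long
negate_callers_err (long err)
{
  long result = NEGATE_PREFIXED (err);
  assert (result == -err);
  return result;
}
```

Passing `err' to NEGATE_SHADOWING would expand to `long err = (err);', which is the kind of silent misbehaviour the renaming avoids.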
Tested for mips64 (all three ABIs) that installed stripped shared
libraries are unchanged by this patch.
* sysdeps/unix/sysv/linux/mips/mips32/sysdep.h (INLINE_SYSCALL):
Use variable name _sc_err instead of err.
[__mips16] (INTERNAL_SYSCALL_NCS): Use variable name _sc_ret
instead of ret.
* sysdeps/unix/sysv/linux/mips/mips64/n32/sysdep.h
(INLINE_SYSCALL): Use variable name _sc_err instead of err.
* sysdeps/unix/sysv/linux/mips/mips64/n64/sysdep.h
(INLINE_SYSCALL): Likewise.
I've moved the MIPS port from ports to the main sysdeps hierarchy.
Beyond the README update, the move of the files was simply
git mv ports/sysdeps/mips sysdeps/mips
git mv ports/sysdeps/unix/mips sysdeps/unix/mips
git mv ports/sysdeps/unix/sysv/linux/mips sysdeps/unix/sysv/linux/mips
and in addition to the ChangeLog entries here, I put a note at the top
of ports/ChangeLog.mips similar to those in other files.
Tested that disassembly of installed shared libraries for mips is the
same before and after this patch (except for ld.so where paths in
assertions are involved, as for arm).
* sysdeps/mips: Move directory from ports/sysdeps/mips.
* sysdeps/unix/mips: Move directory from ports/sysdeps/unix/mips.
* sysdeps/unix/sysv/linux/mips: Move directory from
ports/sysdeps/unix/sysv/linux/mips.
* README: Update listing for mips-*-linux-gnu and
mips64-*-linux-gnu.
* sysdeps/mips: Move directory to ../sysdeps/mips.
* sysdeps/unix/mips: Move directory to ../sysdeps/unix/mips.
* sysdeps/unix/sysv/linux/mips: Move directory to
../sysdeps/unix/sysv/linux/mips.