mirror of
				https://sourceware.org/git/glibc.git
				synced 2025-10-26 00:57:39 +03:00 
			
		
		
		
	2000-08-25 Andreas Jaeger <aj@suse.de> * manual/arith.texi (Control Functions): Clarify possible arguments. Closes PR libc/1856.
		
			
				
	
	
		
			2555 lines
		
	
	
		
			91 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
		
	
	
	
	
	
| @node Arithmetic, Date and Time, Mathematics, Top
 | |
| @c %MENU% Low level arithmetic functions
 | |
| @chapter Arithmetic Functions
 | |
| 
 | |
| This chapter contains information about functions for doing basic
 | |
| arithmetic operations, such as splitting a float into its integer and
 | |
| fractional parts or retrieving the imaginary part of a complex value.
 | |
| These functions are declared in the header files @file{math.h} and
 | |
| @file{complex.h}.
 | |
| 
 | |
| @menu
 | |
| * Integers::                    Basic integer types and concepts
 | |
| * Integer Division::            Integer division with guaranteed rounding.
 | |
| * Floating Point Numbers::      Basic concepts.  IEEE 754.
 | |
| * Floating Point Classes::      The five kinds of floating-point number.
 | |
| * Floating Point Errors::       When something goes wrong in a calculation.
 | |
| * Rounding::                    Controlling how results are rounded.
 | |
| * Control Functions::           Saving and restoring the FPU's state.
 | |
| * Arithmetic Functions::        Fundamental operations provided by the library.
 | |
| * Complex Numbers::             The types.  Writing complex constants.
 | |
| * Operations on Complex::       Projection, conjugation, decomposition.
 | |
| * Parsing of Numbers::          Converting strings to numbers.
 | |
| * System V Number Conversion::  An archaic way to convert numbers to strings.
 | |
| @end menu
 | |
| 
 | |
| @node Integers
 | |
| @section Integers
 | |
| @cindex integer
 | |
| 
 | |
| The C language defines several integer data types: integer, short integer,
 | |
| long integer, and character, all in both signed and unsigned varieties.
 | |
| The GNU C compiler extends the language to contain long long integers
 | |
| as well.
 | |
| @cindex signedness
 | |
| 
 | |
| The C integer types were intended to allow code to be portable among
 | |
| machines with different inherent data sizes (word sizes), so each type
 | |
| may have different ranges on different machines.  The problem with
 | |
| this is that a program often needs to be written for a particular range
 | |
| of integers, and sometimes must be written for a particular size of
 | |
| storage, regardless of what machine the program runs on.
 | |
| 
 | |
| To address this problem, the GNU C library contains C type definitions
 | |
| you can use to declare integers that meet your exact needs.  Because the
 | |
| GNU C library header files are customized to a specific machine, your
 | |
| program source code doesn't have to be.
 | |
| 
 | |
| These @code{typedef}s are in @file{stdint.h}.
 | |
| @pindex stdint.h
 | |
| 
 | |
| If you require that an integer be represented in exactly N bits, use one
 | |
| of the following types, with the obvious mapping to bit size and signedness:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item int8_t
 | |
| @item int16_t
 | |
| @item int32_t
 | |
| @item int64_t
 | |
| @item uint8_t
 | |
| @item uint16_t
 | |
| @item uint32_t
 | |
| @item uint64_t
 | |
| @end itemize
 | |
| 
 | |
| If your C compiler and target machine do not allow integers of a certain
 | |
| size, the corresponding type above is not defined.
 | |
| 
 | |
| If you don't need a specific storage size, but want the smallest data
 | |
| structure with @emph{at least} N bits, use one of these:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item int_least8_t
 | |
| @item int_least16_t
 | |
| @item int_least32_t
 | |
| @item int_least64_t
 | |
| @item uint_least8_t
 | |
| @item uint_least16_t
 | |
| @item uint_least32_t
 | |
| @item uint_least64_t
 | |
| @end itemize
 | |
| 
 | |
| If you don't need a specific storage size, but want the data structure
 | |
| that allows the fastest access while having at least N bits (and
 | |
| among data structures with the same access speed, the smallest one), use
 | |
| one of these:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item int_fast8_t
 | |
| @item int_fast16_t
 | |
| @item int_fast32_t
 | |
| @item int_fast64_t
 | |
| @item uint_fast8_t
 | |
| @item uint_fast16_t
 | |
| @item uint_fast32_t
 | |
| @item uint_fast64_t
 | |
| @end itemize
 | |
| 
 | |
| If you want an integer with the widest range possible on the platform on
 | |
| which it is being used, use one of the following.  If you use these,
 | |
| you should write code that takes into account the variable size and range
 | |
| of the integer.
 | |
| 
 | |
| @itemize @bullet
 | |
| @item intmax_t
 | |
| @item uintmax_t
 | |
| @end itemize
 | |
| 
 | |
| The GNU C library also provides macros that tell you the maximum and
 | |
| minimum possible values for each integer data type.  The macro names
 | |
| follow these examples: @code{INT32_MAX}, @code{UINT8_MAX},
 | |
| @code{INT_FAST32_MIN}, @code{INT_LEAST64_MIN}, @code{UINTMAX_MAX},
 | |
| @code{INTMAX_MAX}, @code{INTMAX_MIN}.  Note that there are no macros for
 | |
| unsigned integer minima.  These are always zero.
 | |
| @cindex maximum possible integer
 | |
| @cindex minimum possible integer
 | |
| 
 | |
| There are similar macros for use with C's built in integer types which
 | |
| should come with your C compiler.  These are described in @ref{Data Type
 | |
| Measurements}.
 | |
| 
 | |
| Don't forget you can use the C @code{sizeof} operator with any of these
 | |
| data types to get the number of bytes of storage each uses.
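
As a rough illustration of the exact-width types and limit macros
described above, here is a minimal sketch.  The helper names
@code{fits_in_int16} and @code{bits_in_int32} are hypothetical and are
not part of the library.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return nonzero if VALUE can be stored in an
   int16_t without loss, using the limit macros from stdint.h.  */
int
fits_in_int16 (intmax_t value)
{
  return value >= INT16_MIN && value <= INT16_MAX;
}

/* Hypothetical helper: number of bits of storage used by int32_t,
   computed with the sizeof operator.  */
size_t
bits_in_int32 (void)
{
  return sizeof (int32_t) * CHAR_BIT;
}
```

Taking the argument as @code{intmax_t} lets the range check accept any
signed integer value the platform supports before it is narrowed.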
 | |
| 
 | |
| 
 | |
| @node Integer Division
 | |
| @section Integer Division
 | |
| @cindex integer division functions
 | |
| 
 | |
| This section describes functions for performing integer division.  These
 | |
| functions are redundant when GNU CC is used, because in GNU C the
 | |
| @samp{/} operator always rounds towards zero.  But in other C
 | |
| implementations, @samp{/} may round differently with negative arguments.
 | |
| @code{div} and @code{ldiv} are useful because they specify how to round
 | |
| the quotient: towards zero.  The remainder has the same sign as the
 | |
| numerator.
 | |
| 
 | |
| These functions are specified to return a result @var{r} such that the value
 | |
| @code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals
 | |
| @var{numerator}.
 | |
| 
 | |
| @pindex stdlib.h
 | |
| To use these facilities, you should include the header file
 | |
| @file{stdlib.h} in your program.
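
The relationship between quotient, remainder, and operands stated above
can be checked directly.  The following is only a sketch; the helper
name @code{div_identity_holds} is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: return nonzero if the result r of div
   satisfies r.quot * denominator + r.rem == numerator.
   DENOMINATOR must be nonzero, and the quotient must be
   representable, otherwise the behavior of div is undefined.  */
int
div_identity_holds (int numerator, int denominator)
{
  div_t r = div (numerator, denominator);
  return r.quot * denominator + r.rem == numerator;
}
```

For instance, dividing @code{-7} by @code{2} yields a quotient of
@code{-3} and a remainder of @code{-1}, since the quotient is rounded
towards zero and the remainder takes the sign of the numerator.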
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftp {Data Type} div_t
 | |
| This is a structure type used to hold the result returned by the @code{div}
 | |
| function.  It has the following members:
 | |
| 
 | |
| @table @code
 | |
| @item int quot
 | |
| The quotient from the division.
 | |
| 
 | |
| @item int rem
 | |
| The remainder from the division.
 | |
| @end table
 | |
| @end deftp
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun div_t div (int @var{numerator}, int @var{denominator})
 | |
| The @code{div} function computes the quotient and remainder from
 | |
| the division of @var{numerator} by @var{denominator}, returning the
 | |
| result in a structure of type @code{div_t}.
 | |
| 
 | |
| If the result cannot be represented (as in a division by zero), the
 | |
| behavior is undefined.
 | |
| 
 | |
| Here is an example, albeit not a very useful one.
 | |
| 
 | |
| @smallexample
 | |
| div_t result;
 | |
| result = div (20, -6);
 | |
| @end smallexample
 | |
| 
 | |
| @noindent
 | |
| Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftp {Data Type} ldiv_t
 | |
| This is a structure type used to hold the result returned by the @code{ldiv}
 | |
| function.  It has the following members:
 | |
| 
 | |
| @table @code
 | |
| @item long int quot
 | |
| The quotient from the division.
 | |
| 
 | |
| @item long int rem
 | |
| The remainder from the division.
 | |
| @end table
 | |
| 
 | |
| (This is identical to @code{div_t} except that the components are of
 | |
| type @code{long int} rather than @code{int}.)
 | |
| @end deftp
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator})
 | |
| The @code{ldiv} function is similar to @code{div}, except that the
 | |
| arguments are of type @code{long int} and the result is returned as a
 | |
| structure of type @code{ldiv_t}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftp {Data Type} lldiv_t
 | |
| This is a structure type used to hold the result returned by the @code{lldiv}
 | |
| function.  It has the following members:
 | |
| 
 | |
| @table @code
 | |
| @item long long int quot
 | |
| The quotient from the division.
 | |
| 
 | |
| @item long long int rem
 | |
| The remainder from the division.
 | |
| @end table
 | |
| 
 | |
| (This is identical to @code{div_t} except that the components are of
 | |
| type @code{long long int} rather than @code{int}.)
 | |
| @end deftp
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun lldiv_t lldiv (long long int @var{numerator}, long long int @var{denominator})
 | |
| The @code{lldiv} function is like the @code{div} function, but the
 | |
| arguments are of type @code{long long int} and the result is returned as
 | |
| a structure of type @code{lldiv_t}.
 | |
| 
 | |
| The @code{lldiv} function was added in @w{ISO C99}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment inttypes.h
 | |
| @comment ISO
 | |
| @deftp {Data Type} imaxdiv_t
 | |
| This is a structure type used to hold the result returned by the @code{imaxdiv}
 | |
| function.  It has the following members:
 | |
| 
 | |
| @table @code
 | |
| @item intmax_t quot
 | |
| The quotient from the division.
 | |
| 
 | |
| @item intmax_t rem
 | |
| The remainder from the division.
 | |
| @end table
 | |
| 
 | |
| (This is identical to @code{div_t} except that the components are of
 | |
| type @code{intmax_t} rather than @code{int}.)
 | |
| 
 | |
| See @ref{Integers} for a description of the @code{intmax_t} type.
 | |
| 
 | |
| @end deftp
 | |
| 
 | |
| @comment inttypes.h
 | |
| @comment ISO
 | |
| @deftypefun imaxdiv_t imaxdiv (intmax_t @var{numerator}, intmax_t @var{denominator})
 | |
| The @code{imaxdiv} function is like the @code{div} function, but the
 | |
| arguments are of type @code{intmax_t} and the result is returned as
 | |
| a structure of type @code{imaxdiv_t}.
 | |
| 
 | |
| See @ref{Integers} for a description of the @code{intmax_t} type.
 | |
| 
 | |
| The @code{imaxdiv} function was added in @w{ISO C99}.
 | |
| @end deftypefun
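
As a small sketch of how @code{imaxdiv} might be used, the hypothetical
helper @code{split_minutes} below splits a count of seconds into whole
minutes and leftover seconds.  The helper name and the choice of 60 as
the divisor are assumptions made for illustration.

```c
#include <inttypes.h>

/* Hypothetical helper: split SECONDS into whole minutes (quot) and
   leftover seconds (rem).  The quotient is rounded towards zero, so
   for negative input both members are zero or negative.  */
imaxdiv_t
split_minutes (intmax_t seconds)
{
  return imaxdiv (seconds, 60);
}
```

The members of the result can be printed with the @code{PRIdMAX}
conversion macro from @file{inttypes.h}.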
 | |
| 
 | |
| 
 | |
| @node Floating Point Numbers
 | |
| @section Floating Point Numbers
 | |
| @cindex floating point
 | |
| @cindex IEEE 754
 | |
| @cindex IEEE floating point
 | |
| 
 | |
| Most computer hardware has support for two different kinds of numbers:
 | |
| integers (@math{@dots{}-3, -2, -1, 0, 1, 2, 3@dots{}}) and
 | |
| floating-point numbers.  Floating-point numbers have three parts: the
 | |
| @dfn{mantissa}, the @dfn{exponent}, and the @dfn{sign bit}.  The real
 | |
| number represented by a floating-point value is given by
 | |
| @tex
 | |
| $(s \mathrel? -1 \mathrel: 1) \cdot 2^e \cdot M$
 | |
| @end tex
 | |
| @ifnottex
 | |
| @math{(s ? -1 : 1) @mul{} 2^e @mul{} M}
 | |
| @end ifnottex
 | |
| where @math{s} is the sign bit, @math{e} the exponent, and @math{M}
 | |
| the mantissa.  @xref{Floating Point Concepts}, for details.  (It is
 | |
| possible to have a different @dfn{base} for the exponent, but all modern
 | |
| hardware uses @math{2}.)
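
The formula above can be explored with @code{frexp} and @code{ldexp}
from @file{math.h}.  This is only a sketch: @code{frexp} returns a
fraction in the range 0.5 (inclusive) to 1 (exclusive), which is a
different normalization from the one used by the hardware encoding.
The structure @code{fp_parts} and both helper names are hypothetical.

```c
#include <math.h>

/* Hypothetical record for the three parts of a floating-point value.  */
struct fp_parts
{
  int negative;     /* The sign bit s.  */
  int exponent;     /* The exponent e, for base 2.  */
  double mantissa;  /* The mantissa M, as normalized by frexp.  */
};

/* Split X into sign, exponent, and mantissa.  */
struct fp_parts
decompose (double x)
{
  struct fp_parts p;
  p.negative = signbit (x) != 0;
  p.mantissa = frexp (fabs (x), &p.exponent);
  return p;
}

/* Rebuild the value as (s ? -1 : 1) * 2^e * M.  */
double
recompose (struct fp_parts p)
{
  return (p.negative ? -1.0 : 1.0) * ldexp (p.mantissa, p.exponent);
}
```

For example, @code{-6.0} decomposes into a set sign bit, an exponent of
@code{3}, and a mantissa of @code{0.75}.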
 | |
| 
 | |
| Floating-point numbers can represent a finite subset of the real
 | |
| numbers.  While this subset is large enough for most purposes, it is
 | |
| important to remember that the only reals that can be represented
 | |
| exactly are rational numbers that have a terminating binary expansion
 | |
| shorter than the width of the mantissa.  Even simple fractions such as
 | |
| @math{1/5} can only be approximated by floating point.
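
The effect of such approximations can be observed by comparing sums.
The sketch below assumes @w{IEEE 754} double precision; the helper name
@code{sum_is_exact} is hypothetical.

```c
/* Hypothetical helper: return nonzero if A + B, rounded to double,
   compares equal to SUM.  The volatile variable forces the sum to be
   stored as a double before the comparison.  */
int
sum_is_exact (double a, double b, double sum)
{
  volatile double s = a + b;
  return s == sum;
}
```

With these assumptions, @code{sum_is_exact (0.5, 0.25, 0.75)} is true
because all three values have short binary expansions, while
@code{sum_is_exact (0.1, 0.2, 0.3)} is false because none of them can be
represented exactly.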
 | |
| 
 | |
| Mathematical operations and functions frequently need to produce values
 | |
| that are not representable.  Often these values can be approximated
 | |
| closely enough for practical purposes, but sometimes they can't.
 | |
| Historically there was no way to tell when the results of a calculation
 | |
| were inaccurate.  Modern computers implement the @w{IEEE 754} standard
 | |
| for numerical computations, which defines a framework for indicating to
 | |
| the program when the results of calculation are not trustworthy.  This
 | |
| framework consists of a set of @dfn{exceptions} that indicate why a
 | |
| result could not be represented, and the special values @dfn{infinity}
 | |
| and @dfn{not a number} (NaN).
 | |
| 
 | |
| @node Floating Point Classes
 | |
| @section Floating-Point Number Classification Functions
 | |
| @cindex floating-point classes
 | |
| @cindex classes, floating-point
 | |
| @pindex math.h
 | |
| 
 | |
| @w{ISO C99} defines macros that let you determine what sort of
 | |
| floating-point number a variable holds.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn {Macro} int fpclassify (@emph{float-type} @var{x})
 | |
| This is a generic macro which works on all floating-point types and
 | |
| which returns a value of type @code{int}.  The possible values are:
 | |
| 
 | |
| @vtable @code
 | |
| @item FP_NAN
 | |
| The floating-point number @var{x} is ``Not a Number'' (@pxref{Infinity
 | |
| and NaN})
 | |
| @item FP_INFINITE
 | |
| The value of @var{x} is either plus or minus infinity (@pxref{Infinity
 | |
| and NaN})
 | |
| @item FP_ZERO
 | |
| The value of @var{x} is zero.  In floating-point formats like @w{IEEE
 | |
| 754}, where zero can be signed, this value is also returned if
 | |
| @var{x} is negative zero.
 | |
| @item FP_SUBNORMAL
 | |
| Numbers whose absolute value is too small to be represented in the
 | |
| normal format are represented in an alternate, @dfn{denormalized} format
 | |
| (@pxref{Floating Point Concepts}).  This format is less precise but can
 | |
| represent values closer to zero.  @code{fpclassify} returns this value
 | |
| for values of @var{x} in this alternate format.
 | |
| @item FP_NORMAL
 | |
| This value is returned for all other values of @var{x}.  It indicates
 | |
| that there is nothing special about the number.
 | |
| @end vtable
 | |
| 
 | |
| @end deftypefn
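
Here is a minimal sketch of how the classification values might be used
together.  The function name @code{describe} and the strings it returns
are hypothetical.

```c
#include <math.h>

/* Hypothetical helper: return a short description of the class of X,
   based on the value returned by fpclassify.  */
const char *
describe (double x)
{
  switch (fpclassify (x))
    {
    case FP_NAN:
      return "NaN";
    case FP_INFINITE:
      return signbit (x) ? "-infinity" : "+infinity";
    case FP_ZERO:
      return signbit (x) ? "-0" : "+0";
    case FP_SUBNORMAL:
      return "subnormal";
    case FP_NORMAL:
      return "normal";
    default:
      /* Not expected; an implementation may define further classes.  */
      return "unknown";
    }
}
```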
 | |
| 
 | |
| @code{fpclassify} is most useful if more than one property of a number
 | |
| must be tested.  There are more specific macros which only test one
 | |
| property at a time.  Generally these macros execute faster than
 | |
| @code{fpclassify}, since there is special hardware support for them.
 | |
| You should therefore use the specific macros whenever possible.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn {Macro} int isfinite (@emph{float-type} @var{x})
 | |
| This macro returns a nonzero value if @var{x} is finite: not plus or
 | |
| minus infinity, and not NaN.  It is equivalent to
 | |
| 
 | |
| @smallexample
 | |
| (fpclassify (x) != FP_NAN && fpclassify (x) != FP_INFINITE)
 | |
| @end smallexample
 | |
| 
 | |
| @code{isfinite} is implemented as a macro which accepts any
 | |
| floating-point type.
 | |
| @end deftypefn
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn {Macro} int isnormal (@emph{float-type} @var{x})
 | |
| This macro returns a nonzero value if @var{x} is finite and normalized.
 | |
| It is equivalent to
 | |
| 
 | |
| @smallexample
 | |
| (fpclassify (x) == FP_NORMAL)
 | |
| @end smallexample
 | |
| @end deftypefn
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn {Macro} int isnan (@emph{float-type} @var{x})
 | |
| This macro returns a nonzero value if @var{x} is NaN.  It is equivalent
 | |
| to
 | |
| 
 | |
| @smallexample
 | |
| (fpclassify (x) == FP_NAN)
 | |
| @end smallexample
 | |
| @end deftypefn
 | |
| 
 | |
| Another set of floating-point classification functions was provided by
 | |
| BSD.  The GNU C library also supports these functions; however, we
 | |
| recommend that you use the ISO C99 macros in new code.  Those are standard
 | |
| and will be available more widely.  Also, since they are macros, you do
 | |
| not have to worry about the type of their argument.
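
The following sketch shows the same macros applied to arguments of
different floating-point types.  The helper names @code{all_finite_f}
and @code{is_normal_ld} are hypothetical.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: return nonzero if all N elements of the float
   array V are finite.  isfinite accepts the float argument directly.  */
int
all_finite_f (const float *v, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (!isfinite (v[i]))
      return 0;
  return 1;
}

/* Hypothetical helper: the same kind of test on a long double,
   without any cast or type-specific function name.  */
int
is_normal_ld (long double x)
{
  return isnormal (x) != 0;
}
```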
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun int isinf (double @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx int isinff (float @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx int isinfl (long double @var{x})
 | |
| This function returns @code{-1} if @var{x} represents negative infinity,
 | |
| @code{1} if @var{x} represents positive infinity, and @code{0} otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun int isnan (double @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx int isnanf (float @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx int isnanl (long double @var{x})
 | |
| This function returns a nonzero value if @var{x} is a ``not a number''
 | |
| value, and zero otherwise.
 | |
| 
 | |
| @strong{Note:} The @code{isnan} macro defined by @w{ISO C99} overrides
 | |
| the BSD function.  This is normally not a problem, because the two
 | |
| routines behave identically.  However, if you really need to get the BSD
 | |
| function for some reason, you can write
 | |
| 
 | |
| @smallexample
 | |
| (isnan) (x)
 | |
| @end smallexample
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun int finite (double @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx int finitef (float @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx int finitel (long double @var{x})
 | |
| This function returns a nonzero value if @var{x} is finite, that is,
 | |
| neither infinite nor a ``not a number'' value, and zero otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double infnan (int @var{error})
 | |
| This function is provided for compatibility with BSD.  Its argument is
 | |
| an error code, @code{EDOM} or @code{ERANGE}; @code{infnan} returns the
 | |
| value that a math function would return if it set @code{errno} to that
 | |
| value.  @xref{Math Error Reporting}.  @code{-ERANGE} is also acceptable
 | |
| as an argument, and corresponds to @code{-HUGE_VAL} as a value.
 | |
| 
 | |
| In the BSD library, on certain machines, @code{infnan} raises a fatal
 | |
| signal in all cases.  The GNU library does not do likewise, because that
 | |
| does not fit the @w{ISO C} specification.
 | |
| @end deftypefun
 | |
| 
 | |
| @strong{Portability Note:} The functions listed in this section, as
 | |
| opposed to the @w{ISO C99} macros, are BSD extensions.
 | |
| 
 | |
| 
 | |
| @node Floating Point Errors
 | |
| @section Errors in Floating-Point Calculations
 | |
| 
 | |
| @menu
 | |
| * FP Exceptions::               IEEE 754 math exceptions and how to detect them.
 | |
| * Infinity and NaN::            Special values returned by calculations.
 | |
| * Status bit operations::       Checking for exceptions after the fact.
 | |
| * Math Error Reporting::        How the math functions report errors.
 | |
| @end menu
 | |
| 
 | |
| @node FP Exceptions
 | |
| @subsection FP Exceptions
 | |
| @cindex exception
 | |
| @cindex signal
 | |
| @cindex zero divide
 | |
| @cindex division by zero
 | |
| @cindex inexact exception
 | |
| @cindex invalid exception
 | |
| @cindex overflow exception
 | |
| @cindex underflow exception
 | |
| 
 | |
| The @w{IEEE 754} standard defines five @dfn{exceptions} that can occur
 | |
| during a calculation.  Each corresponds to a particular sort of error,
 | |
| such as overflow.
 | |
| 
 | |
| When exceptions occur (when exceptions are @dfn{raised}, in the language
 | |
| of the standard), one of two things can happen.  By default the
 | |
| exception is simply noted in the floating-point @dfn{status word}, and
 | |
| the program continues as if nothing had happened.  The operation
 | |
| produces a default value, which depends on the exception (see the table
 | |
| below).  Your program can check the status word to find out which
 | |
| exceptions happened.
 | |
| 
 | |
| Alternatively, you can enable @dfn{traps} for exceptions.  In that case,
 | |
| when an exception is raised, your program will receive the @code{SIGFPE}
 | |
| signal.  The default action for this signal is to terminate the
 | |
| program.  @xref{Signal Handling}, for how you can change the effect of
 | |
| the signal.
 | |
| 
 | |
| @findex matherr
 | |
| In the System V math library, the user-defined function @code{matherr}
 | |
| is called when certain exceptions occur inside math library functions.
 | |
| However, the Unix98 standard deprecates this interface.  We support it
 | |
| for historical compatibility, but recommend that you do not use it in
 | |
| new programs.
 | |
| 
 | |
| @noindent
 | |
| The exceptions defined in @w{IEEE 754} are:
 | |
| 
 | |
| @table @samp
 | |
| @item Invalid Operation
 | |
| This exception is raised if the given operands are invalid for the
 | |
| operation to be performed.  Examples are
 | |
| (see @w{IEEE 754}, @w{section 7}):
 | |
| @enumerate
 | |
| @item
 | |
| Addition or subtraction: @math{@infinity{} - @infinity{}}.  (But
 | |
| @math{@infinity{} + @infinity{} = @infinity{}}).
 | |
| @item
 | |
| Multiplication: @math{0 @mul{} @infinity{}}.
 | |
| @item
 | |
| Division: @math{0/0} or @math{@infinity{}/@infinity{}}.
 | |
| @item
 | |
| Remainder: @math{x} REM @math{y}, where @math{y} is zero or @math{x} is
 | |
| infinite.
 | |
| @item
 | |
| Square root if the operand is less than zero.  More generally, any
 | |
| mathematical function evaluated outside its domain produces this
 | |
| exception.
 | |
| @item
 | |
| Conversion of a floating-point number to an integer or decimal
 | |
| string, when the number cannot be represented in the target format (due
 | |
| to overflow, infinity, or NaN).
 | |
| @item
 | |
| Conversion of an unrecognizable input string.
 | |
| @item
 | |
| Comparison via predicates involving @math{<} or @math{>}, when one or
 | |
| other of the operands is NaN.  You can prevent this exception by using
 | |
| the unordered comparison functions instead; see @ref{FP Comparison Functions}.
 | |
| @end enumerate
 | |
| 
 | |
| If the exception does not trap, the result of the operation is NaN.
 | |
| 
 | |
| @item Division by Zero
 | |
| This exception is raised when a finite nonzero number is divided
 | |
| by zero.  If no trap occurs the result is either @math{+@infinity{}} or
 | |
| @math{-@infinity{}}, depending on the signs of the operands.
 | |
| 
 | |
| @item Overflow
 | |
| This exception is raised whenever the result cannot be represented
 | |
| as a finite value in the precision format of the destination.  If no trap
 | |
| occurs the result depends on the sign of the intermediate result and the
 | |
| current rounding mode (@w{IEEE 754}, @w{section 7.3}):
 | |
| @enumerate
 | |
| @item
 | |
| Round to nearest carries all overflows to @math{@infinity{}}
 | |
| with the sign of the intermediate result.
 | |
| @item
 | |
| Round toward @math{0} carries all overflows to the largest representable
 | |
| finite number with the sign of the intermediate result.
 | |
| @item
 | |
| Round toward @math{-@infinity{}} carries positive overflows to the
 | |
| largest representable finite number and negative overflows to
 | |
| @math{-@infinity{}}.
 | |
| 
 | |
| @item
 | |
| Round toward @math{@infinity{}} carries negative overflows to the
 | |
| most negative representable finite number and positive overflows
 | |
| to @math{@infinity{}}.
 | |
| @end enumerate
 | |
| 
 | |
| Whenever the overflow exception is raised, the inexact exception is also
 | |
| raised.
 | |
| 
 | |
| @item Underflow
 | |
| The underflow exception is raised when an intermediate result is too
 | |
| small to be calculated accurately, or if the operation's result rounded
 | |
| to the destination precision is too small to be normalized.
 | |
| 
 | |
| When no trap is installed for the underflow exception, underflow is
 | |
| signaled (via the underflow flag) only when both tininess and loss of
 | |
| accuracy have been detected.  If no trap handler is installed the
 | |
| operation continues with an imprecise small value, or zero if the
 | |
| destination precision cannot hold the small exact result.
 | |
| 
 | |
| @item Inexact
 | |
| This exception is signaled if a rounded result is not exact (such as
 | |
| when calculating the square root of two) or a result overflows without
 | |
| an overflow trap.
 | |
| @end table
 | |
| 
 | |
| @node Infinity and NaN
 | |
| @subsection Infinity and NaN
 | |
| @cindex infinity
 | |
| @cindex not a number
 | |
| @cindex NaN
 | |
| 
 | |
| @w{IEEE 754} floating point numbers can represent positive or negative
 | |
| infinity, and @dfn{NaN} (not a number).  These three values arise from
 | |
| calculations whose result is undefined or cannot be represented
 | |
| accurately.  You can also deliberately set a floating-point variable to
 | |
| any of them, which is sometimes useful.  Some examples of calculations
 | |
| that produce infinity or NaN:
 | |
| 
 | |
| @ifnottex
 | |
| @smallexample
 | |
| @math{1/0 = @infinity{}}
 | |
| @math{log (0) = -@infinity{}}
 | |
| @math{sqrt (-1) = NaN}
 | |
| @end smallexample
 | |
| @end ifnottex
 | |
| @tex
 | |
| $${1\over0} = \infty$$
 | |
| $$\log 0 = -\infty$$
 | |
| $$\sqrt{-1} = \hbox{NaN}$$
 | |
| @end tex
 | |
| 
 | |
| When a calculation produces any of these values, an exception also
 | |
| occurs; see @ref{FP Exceptions}.
 | |
| 
 | |
| The basic operations and math functions all accept infinity and NaN and
 | |
| produce sensible output.  Infinities propagate through calculations as
 | |
| one would expect: for example, @math{2 + @infinity{} = @infinity{}},
 | |
| @math{4/@infinity{} = 0}, atan @math{(@infinity{}) = @pi{}/2}.  NaN, on
 | |
| the other hand, infects any calculation that involves it.  Unless the
 | |
| calculation would produce the same result no matter what real value
 | |
| replaced NaN, the result is NaN.
 | |
| 
 | |
| In comparison operations, positive infinity is larger than all values
 | |
| except itself and NaN, and negative infinity is smaller than all values
 | |
| except itself and NaN.  NaN is @dfn{unordered}: it is not equal to,
 | |
| greater than, or less than anything, @emph{including itself}. @code{x ==
 | |
| x} is false if the value of @code{x} is NaN.  You can use this to test
 | |
| whether a value is NaN or not, but the recommended way to test for NaN
 | |
| is with the @code{isnan} function (@pxref{Floating Point Classes}).  In
 | |
| addition, @code{<}, @code{>}, @code{<=}, and @code{>=} will raise an
 | |
| exception when applied to NaNs.
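
The comparison rules above can be sketched as follows.  The helper names
are hypothetical, and the self-comparison test assumes the compiler is
not run with options that disregard NaN semantics.

```c
#include <math.h>

/* Hypothetical helper: test for NaN by self-comparison.  Only NaN
   compares unequal to itself.  */
int
is_nan_by_self_compare (double x)
{
  return x != x;
}

/* Hypothetical helper: return nonzero if X is strictly below
   positive infinity.  This is false for positive infinity itself
   and for NaN.  */
int
below_infinity (double x)
{
  return isless (x, INFINITY);
}
```

@code{below_infinity} uses the @code{isless} macro rather than @code{<}
so that a NaN argument does not raise the invalid exception.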
 | |
| 
 | |
| @file{math.h} defines macros that allow you to explicitly set a variable
 | |
| to infinity or NaN.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypevr Macro float INFINITY
 | |
| An expression representing positive infinity.  It is equal to the value
 | |
| produced by mathematical operations like @code{1.0 / 0.0}.
 | |
| @code{-INFINITY} represents negative infinity.
 | |
| 
 | |
| You can test whether a floating-point value is infinite by comparing it
 | |
| to this macro.  However, this is not recommended; you should use the
 | |
| @code{isfinite} macro instead.  @xref{Floating Point Classes}.
 | |
| 
 | |
| This macro was introduced in the @w{ISO C99} standard.
 | |
| @end deftypevr
 | |
| 
 | |
| @comment math.h
 | |
| @comment GNU
 | |
| @deftypevr Macro float NAN
 | |
| An expression representing a value which is ``not a number''.  This
 | |
| macro is a GNU extension, available only on machines that support the
 | |
| ``not a number'' value---that is to say, on all machines that support
 | |
| IEEE floating point.
 | |
| 
 | |
| You can use @samp{#ifdef NAN} to test whether the machine supports
 | |
| NaN.  (Of course, you must arrange for GNU extensions to be visible,
 | |
| such as by defining @code{_GNU_SOURCE}, and then you must include
 | |
| @file{math.h}.)
 | |
| @end deftypevr
 | |
| 
 | |
| @w{IEEE 754} also allows for another unusual value: negative zero.  This
 | |
| value is produced when you divide a positive number by negative
 | |
| infinity, or when a negative result is smaller than the limits of
 | |
| representation.  Negative zero behaves identically to zero in all
 | |
| calculations, unless you explicitly test the sign bit with
 | |
| @code{signbit} or @code{copysign}.
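
A minimal sketch of detecting negative zero with @code{signbit}, assuming
@w{IEEE 754} arithmetic; the helper name @code{is_negative_zero} is
hypothetical.

```c
#include <math.h>

/* Hypothetical helper: return nonzero if X is negative zero.  The
   comparison with 0.0 is true for both zeros, so the sign bit has to
   be examined separately.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}
```

With this helper, @code{is_negative_zero (1.0 / -INFINITY)} is true,
while @code{is_negative_zero (0.0)} is false even though
@code{-0.0 == 0.0} holds.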
 | |
| 
 | |
| @node Status bit operations
 | |
| @subsection Examining the FPU status word
 | |
| 
 | |
| @w{ISO C99} defines functions to query and manipulate the
 | |
| floating-point status word.  You can use these functions to check for
 | |
| untrapped exceptions when it's convenient, rather than worrying about
 | |
| them in the middle of a calculation.
 | |
| 
 | |
| These constants represent the various @w{IEEE 754} exceptions.  Not all
 | |
| FPUs report all the different exceptions.  Each constant is defined if
 | |
| and only if the FPU you are compiling for supports that exception, so
 | |
| you can test for FPU support with @samp{#ifdef}.  They are defined in
 | |
| @file{fenv.h}.
 | |
| 
 | |
| @vtable @code
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @item FE_INEXACT
 | |
|  The inexact exception.
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @item FE_DIVBYZERO
 | |
|  The divide by zero exception.
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @item FE_UNDERFLOW
 | |
|  The underflow exception.
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @item FE_OVERFLOW
 | |
|  The overflow exception.
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @item FE_INVALID
 | |
|  The invalid exception.
 | |
| @end vtable
 | |
| 
 | |
| The macro @code{FE_ALL_EXCEPT} is the bitwise OR of all exception macros
 | |
| which are supported by the FP implementation.
 | |
| 
 | |
| These functions allow you to clear exception flags, test for exceptions,
 | |
| and save and restore the set of exceptions flagged.
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int feclearexcept (int @var{excepts})
 | |
| This function clears all of the supported exception flags indicated by
 | |
| @var{excepts}.
 | |
| 
 | |
| The function returns zero in case the operation was successful, a
 | |
| non-zero value otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int feraiseexcept (int @var{excepts})
 | |
| This function raises the supported exceptions indicated by
 | |
| @var{excepts}.  If more than one exception bit in @var{excepts} is set
 | |
| the order in which the exceptions are raised is undefined except that
 | |
| overflow (@code{FE_OVERFLOW}) or underflow (@code{FE_UNDERFLOW}) are
 | |
| raised before inexact (@code{FE_INEXACT}).  Whether the inexact
 | |
| exception is also raised for overflow or underflow is implementation
 | |
| dependent.
 | |
| 
 | |
| The function returns zero in case the operation was successful, a
 | |
| non-zero value otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int fetestexcept (int @var{excepts})
 | |
| Test whether the exception flags indicated by the parameter @var{excepts}
 | |
| are currently set.  If any of them are, a nonzero value is returned
 | |
| which specifies which exceptions are set.  Otherwise the result is zero.
 | |
| @end deftypefun
 | |
| 
 | |
| To understand these functions, imagine that the status word is an
 | |
| integer variable named @var{status}.  @code{feclearexcept} is then
 | |
| equivalent to @samp{status &= ~excepts} and @code{fetestexcept} is
 | |
| equivalent to @samp{(status & excepts)}.  The actual implementation may
 | |
| be very different, of course.
 | |
| 
 | |
| Exception flags are only cleared when the program explicitly requests it,
 | |
| by calling @code{feclearexcept}.  If you want to check for exceptions
 | |
| from a set of calculations, you should clear all the flags first.  Here
 | |
| is a simple example of the way to use @code{fetestexcept}:
 | |
| 
 | |
| @smallexample
 | |
| @{
 | |
|   double f;
 | |
|   int raised;
 | |
|   feclearexcept (FE_ALL_EXCEPT);
 | |
|   f = compute ();
 | |
|   raised = fetestexcept (FE_OVERFLOW | FE_INVALID);
 | |
|   if (raised & FE_OVERFLOW) @{ /* ... */ @}
 | |
|   if (raised & FE_INVALID) @{ /* ... */ @}
 | |
|   /* ... */
 | |
| @}
 | |
| @end smallexample
 | |
| 
 | |
| You cannot explicitly set bits in the status word.  You can, however,
 | |
| save the entire status word and restore it later.  This is done with the
 | |
| following functions:
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int fegetexceptflag (fexcept_t *@var{flagp}, int @var{excepts})
 | |
| This function stores in the variable pointed to by @var{flagp} an
 | |
| implementation-defined value representing the current setting of the
 | |
| exception flags indicated by @var{excepts}.
 | |
| 
 | |
| The function returns zero in case the operation was successful, a
 | |
| non-zero value otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int fesetexceptflag (const fexcept_t *@var{flagp}, int
 | |
| @var{excepts})
 | |
| This function restores the flags for the exceptions indicated by
 | |
| @var{excepts} to the values stored in the variable pointed to by
 | |
| @var{flagp}.
 | |
| 
 | |
| The function returns zero in case the operation was successful, a
 | |
| non-zero value otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| Note that the value stored in @code{fexcept_t} bears no resemblance to
 | |
| the bit mask returned by @code{fetestexcept}.  The type may not even be
 | |
| an integer.  Do not attempt to modify an @code{fexcept_t} variable.
 | |
| 
 | |
| @node Math Error Reporting
 | |
| @subsection Error Reporting by Mathematical Functions
 | |
| @cindex errors, mathematical
 | |
| @cindex domain error
 | |
| @cindex range error
 | |
| 
 | |
| Many of the math functions are defined only over a subset of the real or
 | |
| complex numbers.  Even if they are mathematically defined, their result
 | |
| may be larger or smaller than the range representable by their return
 | |
| type.  These are known as @dfn{domain errors}, @dfn{overflows}, and
 | |
| @dfn{underflows}, respectively.  Math functions do several things when
 | |
| one of these errors occurs.  In this manual we will refer to the
 | |
| complete response as @dfn{signalling} a domain error, overflow, or
 | |
| underflow.
 | |
| 
 | |
| When a math function suffers a domain error, it raises the invalid
 | |
| exception and returns NaN.  It also sets @code{errno} to @code{EDOM};
 | |
| this is for compatibility with old systems that do not support @w{IEEE
 | |
| 754} exception handling.  Likewise, when overflow occurs, math
 | |
| functions raise the overflow exception and return @math{@infinity{}} or
 | |
| @math{-@infinity{}} as appropriate.  They also set @code{errno} to
 | |
| @code{ERANGE}.  When underflow occurs, the underflow exception is
 | |
| raised, and zero (appropriately signed) is returned.  @code{errno} may be
 | |
| set to @code{ERANGE}, but this is not guaranteed.
 | |
| 
 | |
| Some of the math functions are defined mathematically to result in a
 | |
| complex value over parts of their domains.  The most familiar example of
 | |
| this is taking the square root of a negative number.  The complex math
 | |
| functions, such as @code{csqrt}, will return the appropriate complex value
 | |
| in this case.  The real-valued functions, such as @code{sqrt}, will
 | |
| signal a domain error.
 | |
| 
 | |
| Some older hardware does not support infinities.  On that hardware,
 | |
| overflows instead return a particular very large number (usually the
 | |
| largest representable number).  @file{math.h} defines macros you can use
 | |
| to test for overflow on both old and new hardware.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypevr Macro double HUGE_VAL
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypevrx Macro float HUGE_VALF
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypevrx Macro {long double} HUGE_VALL
 | |
| An expression representing a particular very large number.  On machines
 | |
| that use @w{IEEE 754} floating point format, @code{HUGE_VAL} is infinity.
 | |
| On other machines, it's typically the largest positive number that can
 | |
| be represented.
 | |
| 
 | |
| Mathematical functions return the appropriately typed version of
 | |
| @code{HUGE_VAL} or @code{@minus{}HUGE_VAL} when the result is too large
 | |
| to be represented.
 | |
| @end deftypevr
 | |
| 
 | |
| @node Rounding
 | |
| @section Rounding Modes
 | |
| 
 | |
| Floating-point calculations are carried out internally with extra
 | |
| precision, and then rounded to fit into the destination type.  This
 | |
| ensures that results are as precise as the input data.  @w{IEEE 754}
 | |
| defines four possible rounding modes:
 | |
| 
 | |
| @table @asis
 | |
| @item Round to nearest.
 | |
| This is the default mode.  It should be used unless there is a specific
 | |
| need for one of the others.  In this mode results are rounded to the
 | |
| nearest representable value.  If the result is midway between two
 | |
| representable values, the even representable value is chosen.  @dfn{Even} here
 | |
| means the lowest-order bit is zero.  This rounding mode prevents
 | |
| statistical bias and guarantees numeric stability: round-off errors in a
 | |
| lengthy calculation will remain smaller than half of @code{FLT_EPSILON}.
 | |
| 
 | |
| @c @item Round toward @math{+@infinity{}}
 | |
| @item Round toward plus Infinity.
 | |
| All results are rounded to the smallest representable value
 | |
| which is greater than the result.
 | |
| 
 | |
| @c @item Round toward @math{-@infinity{}}
 | |
| @item Round toward minus Infinity.
 | |
| All results are rounded to the largest representable value which is less
 | |
| than the result.
 | |
| 
 | |
| @item Round toward zero.
 | |
| All results are rounded to the largest representable value whose
 | |
| magnitude is less than that of the result.  In other words, if the
 | |
| result is negative it is rounded up; if it is positive, it is rounded
 | |
| down.
 | |
| @end table
 | |
| 
 | |
| @noindent
 | |
| @file{fenv.h} defines constants which you can use to refer to the
 | |
| various rounding modes.  Each one will be defined if and only if the FPU
 | |
| supports the corresponding rounding mode.
 | |
| 
 | |
| @table @code
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @vindex FE_TONEAREST
 | |
| @item FE_TONEAREST
 | |
| Round to nearest.
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @vindex FE_UPWARD
 | |
| @item FE_UPWARD
 | |
| Round toward @math{+@infinity{}}.
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @vindex FE_DOWNWARD
 | |
| @item FE_DOWNWARD
 | |
| Round toward @math{-@infinity{}}.
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @vindex FE_TOWARDZERO
 | |
| @item FE_TOWARDZERO
 | |
| Round toward zero.
 | |
| @end table
 | |
| 
 | |
| Underflow is an unusual case.  Normally, @w{IEEE 754} floating point
 | |
| numbers are always normalized (@pxref{Floating Point Concepts}).
 | |
| Numbers smaller than @math{2^r} (where @math{r} is the minimum exponent,
 | |
| @code{FLT_MIN_EXP-1} for @code{float}) cannot be represented as
 | |
| normalized numbers.  Rounding all such numbers to zero or @math{2^r}
 | |
| would cause some algorithms to fail at 0.  Therefore, they are left in
 | |
| denormalized form.  That produces loss of precision, since some bits of
 | |
| the mantissa are stolen to indicate the decimal point.
 | |
| 
 | |
| If a result is too small to be represented as a denormalized number, it
 | |
| is rounded to zero.  However, the sign of the result is preserved; if
 | |
| the calculation was negative, the result is @dfn{negative zero}.
 | |
| Negative zero can also result from some operations on infinity, such as
 | |
| @math{4/-@infinity{}}.  Negative zero behaves identically to zero except
 | |
| when the @code{copysign} or @code{signbit} functions are used to check
 | |
| the sign bit directly.
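The behavior of negative zero can be illustrated with a short sketch.  @code{zero_sign} is a hypothetical helper; it shows that an ordinary comparison cannot tell the two zeros apart, while @code{signbit} can.

```c
#include <math.h>

/* Hypothetical helper: classify a zero by its sign bit.
   Returns -1 for negative zero, 1 for positive zero, and 0 if X is
   not a zero at all.  */
int
zero_sign (double x)
{
  if (x != 0.0)              /* both zeros compare equal to 0.0 */
    return 0;
  return signbit (x) ? -1 : 1;
}
```

Applied to the quotient of 4 and minus infinity, this helper returns @code{-1} even though the quotient compares equal to @code{0.0}.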
 | |
| 
 | |
| At any time one of the above four rounding modes is selected.  You can
 | |
| find out which one with this function:
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int fegetround (void)
 | |
| Returns the currently selected rounding mode, represented by one of the
 | |
| values of the defined rounding mode macros.
 | |
| @end deftypefun
 | |
| 
 | |
| @noindent
 | |
| To change the rounding mode, use this function:
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int fesetround (int @var{round})
 | |
| Changes the currently selected rounding mode to @var{round}.  If
 | |
| @var{round} does not correspond to one of the supported rounding modes
 | |
| nothing is changed.  @code{fesetround} returns zero if it changed the
 | |
| rounding mode, and a nonzero value if the mode is not supported.
 | |
| @end deftypefun
 | |
| 
 | |
| You should avoid changing the rounding mode if possible.  It can be an
 | |
| expensive operation; also, some hardware requires you to compile your
 | |
| program differently for it to work.  The resulting code may run slower.
 | |
| See your compiler documentation for details.
 | |
| @c This section used to claim that functions existed to round one number
 | |
| @c in a specific fashion.  I can't find any functions in the library
 | |
| @c that do that. -zw
 | |
| 
 | |
| @node Control Functions
 | |
| @section Floating-Point Control Functions
 | |
| 
 | |
| @w{IEEE 754} floating-point implementations allow the programmer to
 | |
| decide whether traps will occur for each of the exceptions, by setting
 | |
| bits in the @dfn{control word}.  In C, traps result in the program
 | |
| receiving the @code{SIGFPE} signal; see @ref{Signal Handling}.
 | |
| 
 | |
| @strong{Note:} @w{IEEE 754} says that trap handlers are given details of
 | |
| the exceptional situation, and can set the result value.  C signals do
 | |
| not provide any mechanism to pass this information back and forth.
 | |
| Trapping exceptions in C is therefore not very useful.
 | |
| 
 | |
| It is sometimes necessary to save the state of the floating-point unit
 | |
| while you perform some calculation.  The library provides functions
 | |
| which save and restore the exception flags, the set of exceptions that
 | |
| generate traps, and the rounding mode.  This information is known as the
 | |
| @dfn{floating-point environment}.
 | |
| 
 | |
| The functions to save and restore the floating-point environment all use
 | |
| a variable of type @code{fenv_t} to store information.  This type is
 | |
| defined in @file{fenv.h}.  Its size and contents are
 | |
| implementation-defined.  You should not attempt to manipulate a variable
 | |
| of this type directly.
 | |
| 
 | |
| To save the state of the FPU, use one of these functions:
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int fegetenv (fenv_t *@var{envp})
 | |
| Store the floating-point environment in the variable pointed to by
 | |
| @var{envp}.
 | |
| 
 | |
| The function returns zero in case the operation was successful, a
 | |
| non-zero value otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int feholdexcept (fenv_t *@var{envp})
 | |
| Store the current floating-point environment in the object pointed to by
 | |
| @var{envp}.  Then clear all exception flags, and set the FPU to trap no
 | |
| exceptions.  Not all FPUs support trapping no exceptions; if
 | |
| @code{feholdexcept} cannot set this mode, it returns a nonzero value.  If it
 | |
| succeeds, it returns zero.
 | |
| @end deftypefun
 | |
| 
 | |
| The functions which restore the floating-point environment can take these
 | |
| kinds of arguments:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item
 | |
| Pointers to @code{fenv_t} objects, which were initialized previously by a
 | |
| call to @code{fegetenv} or @code{feholdexcept}.
 | |
| @item
 | |
| @vindex FE_DFL_ENV
 | |
| The special macro @code{FE_DFL_ENV} which represents the floating-point
 | |
| environment as it was available at program start.
 | |
| @item
 | |
| Implementation-defined macros with names starting with @code{FE_} and
 | |
| having type @code{fenv_t *}.
 | |
| 
 | |
| @vindex FE_NOMASK_ENV
 | |
| If possible, the GNU C Library defines a macro @code{FE_NOMASK_ENV}
 | |
| which represents an environment where every exception raised causes a
 | |
| trap to occur.  You can test for this macro using @code{#ifdef}.  It is
 | |
| only defined if @code{_GNU_SOURCE} is defined.
 | |
| 
 | |
| Some platforms might define other predefined environments.
 | |
| @end itemize
 | |
| 
 | |
| @noindent
 | |
| To set the floating-point environment, you can use either of these
 | |
| functions:
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int fesetenv (const fenv_t *@var{envp})
 | |
| Set the floating-point environment to that described by @var{envp}.
 | |
| 
 | |
| The function returns zero in case the operation was successful, a
 | |
| non-zero value otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment ISO
 | |
| @deftypefun int feupdateenv (const fenv_t *@var{envp})
 | |
| Like @code{fesetenv}, this function sets the floating-point environment
 | |
| to that described by @var{envp}.  However, if any exceptions were
 | |
| flagged in the status word before @code{feupdateenv} was called, they
 | |
| remain flagged after the call.  In other words, after @code{feupdateenv}
 | |
| is called, the status word is the bitwise OR of the previous status word
 | |
| and the one saved in @var{envp}.
 | |
| 
 | |
| The function returns zero in case the operation was successful, a
 | |
| non-zero value otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @noindent
 | |
| To control whether raising an individual exception causes a trap to
 | |
| occur, you can use the following functions.
 | |
| 
 | |
| @strong{Portability Note:} These functions are all GNU extensions.
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment GNU
 | |
| @deftypefun int feenableexcept (int @var{excepts})
 | |
| This function enables traps for each of the exceptions indicated by
 | |
| the parameter @var{excepts}.  The individual exceptions are described in
 | |
| @ref{Status bit operations}.  Only the specified exceptions are
 | |
| enabled; the status of the other exceptions is not changed.
 | |
| 
 | |
| The function returns the previously enabled exceptions in case the
 | |
| operation was successful, @code{-1} otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment GNU
 | |
| @deftypefun int fedisableexcept (int @var{excepts})
 | |
| This function disables traps for each of the exceptions indicated by
 | |
| the parameter @var{excepts}.  The individual exceptions are described in
 | |
| @ref{Status bit operations}.  Only the specified exceptions are
 | |
| disabled; the status of the other exceptions is not changed.
 | |
| 
 | |
| The function returns the previously enabled exceptions in case the
 | |
| operation was successful, @code{-1} otherwise.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment fenv.h
 | |
| @comment GNU
 | |
| @deftypefun int fegetexcept (void)
 | |
| The function returns a bitmask of all currently enabled exceptions.  It
 | |
| returns @code{-1} in case of failure.
 | |
| @end deftypefun
 | |
| 
 | |
| @node Arithmetic Functions
 | |
| @section Arithmetic Functions
 | |
| 
 | |
| The C library provides functions to do basic operations on
 | |
| floating-point numbers.  These include absolute value, maximum and minimum,
 | |
| normalization, bit twiddling, rounding, and a few others.
 | |
| 
 | |
| @menu
 | |
| * Absolute Value::              Absolute values of integers and floats.
 | |
| * Normalization Functions::     Extracting exponents and putting them back.
 | |
| * Rounding Functions::          Rounding floats to integers.
 | |
| * Remainder Functions::         Remainders on division, precisely defined.
 | |
| * FP Bit Twiddling::            Sign bit adjustment.  Adding epsilon.
 | |
| * FP Comparison Functions::     Comparisons without risk of exceptions.
 | |
| * Misc FP Arithmetic::          Max, min, positive difference, multiply-add.
 | |
| @end menu
 | |
| 
 | |
| @node Absolute Value
 | |
| @subsection Absolute Value
 | |
| @cindex absolute value functions
 | |
| 
 | |
| These functions are provided for obtaining the @dfn{absolute value} (or
 | |
| @dfn{magnitude}) of a number.  The absolute value of a real number
 | |
| @var{x} is @var{x} if @var{x} is positive, @minus{}@var{x} if @var{x} is
 | |
| negative.  For a complex number @var{z}, whose real part is @var{x} and
 | |
| whose imaginary part is @var{y}, the absolute value is @w{@code{sqrt
 | |
| (@var{x}*@var{x} + @var{y}*@var{y})}}.
 | |
| 
 | |
| @pindex math.h
 | |
| @pindex stdlib.h
 | |
| Prototypes for @code{abs}, @code{labs} and @code{llabs} are in @file{stdlib.h};
 | |
| @code{imaxabs} is declared in @file{inttypes.h};
 | |
| @code{fabs}, @code{fabsf} and @code{fabsl} are declared in @file{math.h}.
 | |
| @code{cabs}, @code{cabsf} and @code{cabsl} are declared in @file{complex.h}.
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun int abs (int @var{number})
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefunx {long int} labs (long int @var{number})
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefunx {long long int} llabs (long long int @var{number})
 | |
| @comment inttypes.h
 | |
| @comment ISO
 | |
| @deftypefunx intmax_t imaxabs (intmax_t @var{number})
 | |
| These functions return the absolute value of @var{number}.
 | |
| 
 | |
| Most computers use a two's complement integer representation, in which
 | |
| the absolute value of @code{INT_MIN} (the smallest possible @code{int})
 | |
| cannot be represented; thus, @w{@code{abs (INT_MIN)}} is not defined.
 | |
| 
 | |
| @code{llabs} and @code{imaxabs} are new to @w{ISO C99}.
 | |
| 
 | |
| See @ref{Integers} for a description of the @code{intmax_t} type.
 | |
| 
 | |
| @end deftypefun
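Because @w{@code{abs (INT_MIN)}} is not defined, code that may see arbitrary input has to test for that value before calling @code{abs}.  A minimal sketch, using the hypothetical name @code{checked_abs}:

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: store |NUMBER| in *RESULT and return 0, or
   return -1 without calling abs when the result is not representable.  */
int
checked_abs (int number, int *result)
{
  /* On a two's complement machine INT_MIN is less than -INT_MAX, so
     its absolute value does not fit in an int.  */
  if (number < -INT_MAX)
    return -1;
  *result = abs (number);
  return 0;
}
```

The comparison against @code{-INT_MAX} is written so that it is also correct on a machine where @code{INT_MIN} equals @code{-INT_MAX}; there the test never fails.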
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double fabs (double @var{number})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float fabsf (float @var{number})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} fabsl (long double @var{number})
 | |
| This function returns the absolute value of the floating-point number
 | |
| @var{number}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefun double cabs (complex double @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx float cabsf (complex float @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} cabsl (complex long double @var{z})
 | |
| These functions return the absolute value of the complex number @var{z}
 | |
| (@pxref{Complex Numbers}).  The absolute value of a complex number is:
 | |
| 
 | |
| @smallexample
 | |
| sqrt (creal (@var{z}) * creal (@var{z}) + cimag (@var{z}) * cimag (@var{z}))
 | |
| @end smallexample
 | |
| 
 | |
| These functions should always be used instead of the direct formula
 | |
| because they take special care to avoid losing precision.  They may also
 | |
| take advantage of hardware support for this operation.  See @code{hypot}
 | |
| in @ref{Exponents and Logarithms}.
 | |
| @end deftypefun
 | |
| 
 | |
| @node Normalization Functions
 | |
| @subsection Normalization Functions
 | |
| @cindex normalization functions (floating-point)
 | |
| 
 | |
| The functions described in this section are primarily provided as a way
 | |
| to efficiently perform certain low-level manipulations on floating point
 | |
| numbers that are represented internally using a binary radix;
 | |
| see @ref{Floating Point Concepts}.  These functions are required to
 | |
| have equivalent behavior even if the representation does not use a radix
 | |
| of 2, but of course they are unlikely to be particularly efficient in
 | |
| those cases.
 | |
| 
 | |
| @pindex math.h
 | |
| All these functions are declared in @file{math.h}.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double frexp (double @var{value}, int *@var{exponent})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float frexpf (float @var{value}, int *@var{exponent})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} frexpl (long double @var{value}, int *@var{exponent})
 | |
| These functions are used to split the number @var{value}
 | |
| into a normalized fraction and an exponent.
 | |
| 
 | |
| If the argument @var{value} is not zero, the return value is @var{value}
 | |
| times a power of two, and is always in the range 1/2 (inclusive) to 1
 | |
| (exclusive).  The corresponding exponent is stored in
 | |
| @code{*@var{exponent}}; the return value multiplied by 2 raised to this
 | |
| exponent equals the original number @var{value}.
 | |
| 
 | |
| For example, @code{frexp (12.8, &exponent)} returns @code{0.8} and
 | |
| stores @code{4} in @code{exponent}.
 | |
| 
 | |
| If @var{value} is zero, then the return value is zero and
 | |
| zero is stored in @code{*@var{exponent}}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double ldexp (double @var{value}, int @var{exponent})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float ldexpf (float @var{value}, int @var{exponent})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} ldexpl (long double @var{value}, int @var{exponent})
 | |
| These functions return the result of multiplying the floating-point
 | |
| number @var{value} by 2 raised to the power @var{exponent}.  (It can
 | |
| be used to reassemble floating-point numbers that were taken apart
 | |
| by @code{frexp}.)
 | |
| 
 | |
| For example, @code{ldexp (0.8, 4)} returns @code{12.8}.
 | |
| @end deftypefun
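The relationship between @code{frexp} and @code{ldexp} can be sketched as a round trip.  @code{split_and_rejoin} is a hypothetical helper, not a library function.

```c
#include <math.h>

/* Hypothetical helper: split VALUE with frexp and put it back together
   with ldexp.  Returns the reassembled number; the two pieces are
   stored through FRACTION and EXPONENT.  */
double
split_and_rejoin (double value, double *fraction, int *exponent)
{
  *fraction = frexp (value, exponent);
  return ldexp (*fraction, *exponent);
}
```

Since both steps only scale by a power of two, the value returned compares equal to @var{value} for finite arguments on a binary floating-point machine.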
 | |
| 
 | |
| The following functions, which come from BSD, provide facilities
 | |
| equivalent to those of @code{ldexp} and @code{frexp}.
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double logb (double @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx float logbf (float @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx {long double} logbl (long double @var{x})
 | |
| These functions return the integer part of the base-2 logarithm of
 | |
| @var{x}, an integer value represented in type @code{double}.  This is
 | |
| the highest integer power of @code{2} contained in @var{x}.  The sign of
 | |
| @var{x} is ignored.  For example, @code{logb (3.5)} is @code{1.0} and
 | |
| @code{logb (4.0)} is @code{2.0}.
 | |
| 
 | |
| When @code{2} raised to this power is divided into @var{x}, it gives a
 | |
| quotient between @code{1} (inclusive) and @code{2} (exclusive).
 | |
| 
 | |
| If @var{x} is zero, the return value is minus infinity if the machine
 | |
| supports infinities, and a very small number if it does not.  If @var{x}
 | |
| is infinity, the return value is infinity.
 | |
| 
 | |
| For finite @var{x}, the value returned by @code{logb} is one less than
 | |
| the value that @code{frexp} would store into @code{*@var{exponent}}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double scalb (double @var{value}, double @var{exponent})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx float scalbf (float @var{value}, float @var{exponent})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx {long double} scalbl (long double @var{value}, long double @var{exponent})
 | |
| The @code{scalb} function is the BSD name for @code{ldexp}; it takes
 | |
| the exponent as a floating-point number.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double scalbn (double @var{x}, int @var{n})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx float scalbnf (float @var{x}, int @var{n})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx {long double} scalbnl (long double @var{x}, int @var{n})
 | |
| @code{scalbn} is identical to @code{scalb}, except that the exponent
 | |
| @var{n} is an @code{int} instead of a floating-point number.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double scalbln (double @var{x}, long int @var{n})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx float scalblnf (float @var{x}, long int @var{n})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx {long double} scalblnl (long double @var{x}, long int @var{n})
 | |
| @code{scalbln} is identical to @code{scalb}, except that the exponent
 | |
| @var{n} is a @code{long int} instead of a floating-point number.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double significand (double @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx float significandf (float @var{x})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx {long double} significandl (long double @var{x})
 | |
| @code{significand} returns the mantissa of @var{x} scaled to the range
 | |
| @math{[1, 2)}.
 | |
| It is equivalent to @w{@code{scalb (@var{x}, (double) -ilogb (@var{x}))}}.
 | |
| 
 | |
| This function exists mainly for use in certain standardized tests
 | |
| of @w{IEEE 754} conformance.
 | |
| @end deftypefun
 | |
| 
 | |
| @node Rounding Functions
 | |
| @subsection Rounding Functions
 | |
| @cindex converting floats to integers
 | |
| 
 | |
| @pindex math.h
 | |
| The functions listed here perform operations such as rounding and
 | |
| truncation of floating-point values. Some of these functions convert
 | |
| floating point numbers to integer values.  They are all declared in
 | |
| @file{math.h}.
 | |
| 
 | |
| You can also convert floating-point numbers to integers simply by
 | |
| casting them to @code{int}.  This discards the fractional part,
 | |
| effectively rounding towards zero.  However, this only works if the
 | |
| result can actually be represented as an @code{int}---for very large
 | |
| numbers, this is impossible.  The functions listed here return the
 | |
| result as a @code{double} instead to get around this problem.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double ceil (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float ceilf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} ceill (long double @var{x})
 | |
| These functions round @var{x} upwards to the nearest integer,
 | |
| returning that value as a @code{double}.  Thus, @code{ceil (1.5)}
 | |
| is @code{2.0}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double floor (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float floorf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} floorl (long double @var{x})
 | |
| These functions round @var{x} downwards to the nearest
 | |
| integer, returning that value as a @code{double}.  Thus, @code{floor
 | |
| (1.5)} is @code{1.0} and @code{floor (-1.5)} is @code{-2.0}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double trunc (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float truncf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} truncl (long double @var{x})
 | |
| The @code{trunc} functions round @var{x} towards zero to the nearest
 | |
| integer (returned in floating-point format).  Thus, @code{trunc (1.5)}
 | |
| is @code{1.0} and @code{trunc (-1.5)} is @code{-1.0}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double rint (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float rintf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} rintl (long double @var{x})
 | |
| These functions round @var{x} to an integer value according to the
 | |
| current rounding mode.  @xref{Floating Point Parameters}, for
 | |
| information about the various rounding modes.  The default
 | |
| rounding mode is to round to the nearest integer; some machines
 | |
| support other modes, but round-to-nearest is always used unless
 | |
| you explicitly select another.
 | |
| 
 | |
| If @var{x} was not initially an integer, these functions raise the
 | |
| inexact exception.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double nearbyint (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float nearbyintf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} nearbyintl (long double @var{x})
 | |
| These functions return the same value as the @code{rint} functions, but
 | |
| do not raise the inexact exception if @var{x} is not an integer.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double round (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float roundf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} roundl (long double @var{x})
 | |
| These functions are similar to @code{rint}, but they round halfway
 | |
| cases away from zero instead of to the nearest even integer.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun {long int} lrint (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long int} lrintf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long int} lrintl (long double @var{x})
 | |
| These functions are just like @code{rint}, but they return a
 | |
| @code{long int} instead of a floating-point number.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun {long long int} llrint (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long long int} llrintf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long long int} llrintl (long double @var{x})
 | |
| These functions are just like @code{rint}, but they return a
 | |
| @code{long long int} instead of a floating-point number.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun {long int} lround (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long int} lroundf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long int} lroundl (long double @var{x})
 | |
| These functions are just like @code{round}, but they return a
 | |
| @code{long int} instead of a floating-point number.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun {long long int} llround (double @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long long int} llroundf (float @var{x})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long long int} llroundl (long double @var{x})
 | |
| These functions are just like @code{round}, but they return a
 | |
| @code{long long int} instead of a floating-point number.
 | |
| @end deftypefun
 | |
| 
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double modf (double @var{value}, double *@var{integer-part})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float modff (float @var{value}, float *@var{integer-part})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} modfl (long double @var{value}, long double *@var{integer-part})
 | |
| These functions break the argument @var{value} into an integer part and a
 | |
| fractional part (between @code{-1} and @code{1}, exclusive).  Their sum
 | |
| equals @var{value}.  Each of the parts has the same sign as @var{value},
 | |
| and the integer part is always rounded toward zero.
 | |
| 
 | |
| @code{modf} stores the integer part in @code{*@var{integer-part}}, and
 | |
| returns the fractional part.  For example, @code{modf (2.5, &intpart)}
 | |
| returns @code{0.5} and stores @code{2.0} into @code{intpart}.
 | |
| @end deftypefun
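As a small illustration of the properties described for @code{modf},
the sketch below splits a value and checks that the two parts add back
up to it.  The helper name @code{split_is_exact} is an assumption made
for this example only.

```c
#include <math.h>

/* Hypothetical helper: split VALUE into whole and fractional parts
   with modf and report whether the parts add back up to VALUE.  */
int
split_is_exact (double value, double *whole, double *frac)
{
  *frac = modf (value, whole);   /* integer part goes to *whole */
  return *whole + *frac == value;
}
```

For @code{-3.75} this stores @code{-3.0} and @code{-0.75}: both parts
carry the sign of the argument, and the integer part is rounded toward
zero.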
 | |
| 
 | |
| @node Remainder Functions
 | |
| @subsection Remainder Functions
 | |
| 
 | |
| The functions in this section compute the remainder on division of two
 | |
| floating-point numbers.  Each is a little different; pick the one that
 | |
| suits your problem.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double fmod (double @var{numerator}, double @var{denominator})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float fmodf (float @var{numerator}, float @var{denominator})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} fmodl (long double @var{numerator}, long double @var{denominator})
 | |
| These functions compute the remainder from the division of
 | |
| @var{numerator} by @var{denominator}.  Specifically, the return value is
 | |
| @code{@var{numerator} - @w{@var{n} * @var{denominator}}}, where @var{n}
 | |
| is the quotient of @var{numerator} divided by @var{denominator}, rounded
 | |
| towards zero to an integer.  Thus, @w{@code{fmod (6.5, 2.3)}} returns
 | |
| @code{1.9}, which is @code{6.5} minus @code{4.6}.
 | |
| 
 | |
| The result has the same sign as the @var{numerator} and has magnitude
 | |
| less than the magnitude of the @var{denominator}.
 | |
| 
 | |
| If @var{denominator} is zero, @code{fmod} signals a domain error.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double drem (double @var{numerator}, double @var{denominator})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx float dremf (float @var{numerator}, float @var{denominator})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx {long double} dreml (long double @var{numerator}, long double @var{denominator})
 | |
| These functions are like @code{fmod} except that they round the
 | |
| internal quotient @var{n} to the nearest integer instead of towards zero
 | |
| to an integer.  For example, @code{drem (6.5, 2.3)} returns @code{-0.4},
 | |
| which is @code{6.5} minus @code{6.9}.
 | |
| 
 | |
| The absolute value of the result is less than or equal to half the
 | |
| absolute value of the @var{denominator}.  The difference between
 | |
| @code{fmod (@var{numerator}, @var{denominator})} and @code{drem
 | |
| (@var{numerator}, @var{denominator})} is always either
 | |
| @var{denominator}, minus @var{denominator}, or zero.
 | |
| 
 | |
| If @var{denominator} is zero, @code{drem} signals a domain error.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefun double remainder (double @var{numerator}, double @var{denominator})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx float remainderf (float @var{numerator}, float @var{denominator})
 | |
| @comment math.h
 | |
| @comment BSD
 | |
| @deftypefunx {long double} remainderl (long double @var{numerator}, long double @var{denominator})
 | |
| These functions are another name for the @code{drem} functions.
 | |
| @end deftypefun
 | |
| 
 | |
| @node FP Bit Twiddling
 | |
| @subsection Setting and modifying single bits of FP values
 | |
| @cindex FP arithmetic
 | |
| 
 | |
| There are some operations that are too complicated or expensive to
 | |
| perform by hand on floating-point numbers.  @w{ISO C99} defines
 | |
| functions to do these operations, which mostly involve changing single
 | |
| bits.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double copysign (double @var{x}, double @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float copysignf (float @var{x}, float @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} copysignl (long double @var{x}, long double @var{y})
 | |
| These functions return @var{x} but with the sign of @var{y}.  They work
 | |
| even if @var{x} or @var{y} are NaN or zero.  Both of these can carry a
 | |
| sign (although not all implementations support it) and this is one of
 | |
| the few operations that can tell the difference.
 | |
| 
 | |
| @code{copysign} never raises an exception.
 | |
| @c except signalling NaNs
 | |
| 
 | |
| This function is defined in @w{IEC 559} (and the appendix with
 | |
| recommended functions in @w{IEEE 754}/@w{IEEE 854}).
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun int signbit (@emph{float-type} @var{x})
 | |
| @code{signbit} is a generic macro which can work on all floating-point
 | |
| types.  It returns a nonzero value if the value of @var{x} has its sign
 | |
| bit set.
 | |
| 
 | |
| This is not the same as @code{x < 0.0}, because @w{IEEE 754} floating
 | |
| point allows zero to be signed.  The comparison @code{-0.0 < 0.0} is
 | |
| false, but @code{signbit (-0.0)} will return a nonzero value.
 | |
| @end deftypefun
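A short sketch of how @code{signbit} and @code{copysign} can tell the
two zeros apart, assuming an @w{IEEE 754} system with signed zeros.
The helper name @code{is_negative_zero} is hypothetical.

```c
#include <math.h>

/* Hypothetical helper: nonzero only when X is negative zero.
   An ordinary comparison cannot do this, since -0.0 == 0.0.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x);
}
```

@code{is_negative_zero (copysign (0.0, -1.0))} is nonzero, while
@code{is_negative_zero (0.0)} is zero.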
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double nextafter (double @var{x}, double @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float nextafterf (float @var{x}, float @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} nextafterl (long double @var{x}, long double @var{y})
 | |
| The @code{nextafter} function returns the next representable neighbor of
 | |
| @var{x} in the direction towards @var{y}.  The size of the step between
 | |
| @var{x} and the result depends on the type of the result.  If
 | |
| @math{@var{x} = @var{y}} the function simply returns @var{x}.  If either
 | |
| value is @code{NaN}, @code{NaN} is returned.  Otherwise
 | |
| a value corresponding to the value of the least significant bit in the
 | |
| mantissa is added or subtracted, depending on the direction.
 | |
| @code{nextafter} will signal overflow or underflow if the result goes
 | |
| outside of the range of normalized numbers.
 | |
| 
 | |
| This function is defined in @w{IEC 559} (and the appendix with
 | |
| recommended functions in @w{IEEE 754}/@w{IEEE 854}).
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double nexttoward (double @var{x}, long double @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float nexttowardf (float @var{x}, long double @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} nexttowardl (long double @var{x}, long double @var{y})
 | |
| These functions are identical to the corresponding versions of
 | |
| @code{nextafter} except that their second argument is a @code{long
 | |
| double}.
 | |
| @end deftypefun
 | |
| 
 | |
| @cindex NaN
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double nan (const char *@var{tagp})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float nanf (const char *@var{tagp})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} nanl (const char *@var{tagp})
 | |
| The @code{nan} function returns a representation of NaN, provided that
 | |
| NaN is supported by the target platform.
 | |
| @code{nan ("@var{n-char-sequence}")} is equivalent to
 | |
| @code{strtod ("NAN(@var{n-char-sequence})", NULL)}.
 | |
| 
 | |
| The argument @var{tagp} is used in an unspecified manner.  On @w{IEEE
 | |
| 754} systems, there are many representations of NaN, and @var{tagp}
 | |
| selects one.  On other systems it may do nothing.
 | |
| @end deftypefun
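A typical use of @code{nan} is to mark a missing value.  This is a
sketch only, assuming the platform supports NaN; the helper name
@code{value_or_nan} is hypothetical.

```c
#include <math.h>

/* Hypothetical helper: return *P, or a quiet NaN when P is null.
   The empty tag leaves the choice of NaN payload to the library.  */
double
value_or_nan (const double *p)
{
  return p ? *p : nan ("");
}
```

The result can then be tested with @code{isnan}.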
 | |
| 
 | |
| @node FP Comparison Functions
 | |
| @subsection Floating-Point Comparison Functions
 | |
| @cindex unordered comparison
 | |
| 
 | |
| The standard C comparison operators provoke exceptions when one or other
 | |
| of the operands is NaN.  For example,
 | |
| 
 | |
| @smallexample
 | |
| int v = a < 1.0;
 | |
| @end smallexample
 | |
| 
 | |
| @noindent
 | |
| will raise an exception if @var{a} is NaN.  (This does @emph{not}
 | |
| happen with @code{==} and @code{!=}; those merely return false and true,
 | |
| respectively, when NaN is examined.)  Frequently this exception is
 | |
| undesirable.  @w{ISO C99} therefore defines comparison functions that
 | |
| do not raise exceptions when NaN is examined.  All of the functions are
 | |
| implemented as macros which allow their arguments to be of any
 | |
| floating-point type.  The macros are guaranteed to evaluate their
 | |
| arguments only once.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn Macro int isgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
 | |
| This macro determines whether the argument @var{x} is greater than
 | |
| @var{y}.  It is equivalent to @code{(@var{x}) > (@var{y})}, but no
 | |
| exception is raised if @var{x} or @var{y} are NaN.
 | |
| @end deftypefn
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn Macro int isgreaterequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
 | |
| This macro determines whether the argument @var{x} is greater than or
 | |
| equal to @var{y}.  It is equivalent to @code{(@var{x}) >= (@var{y})}, but no
 | |
| exception is raised if @var{x} or @var{y} are NaN.
 | |
| @end deftypefn
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn Macro int isless (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
 | |
| This macro determines whether the argument @var{x} is less than @var{y}.
 | |
| It is equivalent to @code{(@var{x}) < (@var{y})}, but no exception is
 | |
| raised if @var{x} or @var{y} are NaN.
 | |
| @end deftypefn
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn Macro int islessequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
 | |
| This macro determines whether the argument @var{x} is less than or equal
 | |
| to @var{y}.  It is equivalent to @code{(@var{x}) <= (@var{y})}, but no
 | |
| exception is raised if @var{x} or @var{y} are NaN.
 | |
| @end deftypefn
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn Macro int islessgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
 | |
| This macro determines whether the argument @var{x} is less or greater
 | |
| than @var{y}.  It is equivalent to @code{(@var{x}) < (@var{y}) ||
 | |
| (@var{x}) > (@var{y})} (although it only evaluates @var{x} and @var{y}
 | |
| once), but no exception is raised if @var{x} or @var{y} are NaN.
 | |
| 
 | |
| This macro is not equivalent to @code{@var{x} != @var{y}}, because that
 | |
| expression is true if @var{x} or @var{y} are NaN.
 | |
| @end deftypefn
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefn Macro int isunordered (@emph{real-floating} @var{x}, @emph{real-floating} @var{y})
 | |
| This macro determines whether its arguments are unordered.  In other
 | |
| words, it is true if @var{x} or @var{y} are NaN, and false otherwise.
 | |
| @end deftypefn
 | |
| 
 | |
| Not all machines provide hardware support for these operations.  On
 | |
| machines that don't, the macros can be very slow.  Therefore, you should
 | |
| not use these macros when NaN is not a concern.
 | |
| 
 | |
| @strong{Note:} There are no macros @code{isequal} or @code{isunequal}.
 | |
| They are unnecessary, because the @code{==} and @code{!=} operators do
 | |
| @emph{not} raise an exception if one or both of the operands are NaN.
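The comparison macros combine naturally into a three-way comparison
that reports unordered operands instead of raising an exception.  The
following is a minimal sketch; the function name @code{compare_quiet}
and its return convention are assumptions made for this example.

```c
#include <math.h>

/* Hypothetical helper: returns -1, 0 or 1 when A is less than, equal
   to or greater than B, and 2 when either operand is NaN.  None of
   the macros used raises an exception for a quiet NaN.  */
int
compare_quiet (double a, double b)
{
  if (isunordered (a, b))
    return 2;
  if (isless (a, b))
    return -1;
  if (isgreater (a, b))
    return 1;
  return 0;
}
```

Checking @code{isunordered} first keeps the NaN case separate from the
``equal'' case, which the other macros alone cannot distinguish.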
 | |
| 
 | |
| @node Misc FP Arithmetic
 | |
| @subsection Miscellaneous FP arithmetic functions
 | |
| @cindex minimum
 | |
| @cindex maximum
 | |
| @cindex positive difference
 | |
| @cindex multiply-add
 | |
| 
 | |
| The functions in this section perform miscellaneous but common
 | |
| operations that are awkward to express with C operators.  On some
 | |
| processors these functions can use special machine instructions to
 | |
| perform these operations faster than the equivalent C code.
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double fmin (double @var{x}, double @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float fminf (float @var{x}, float @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} fminl (long double @var{x}, long double @var{y})
 | |
| The @code{fmin} function returns the lesser of the two values @var{x}
 | |
| and @var{y}.  It is similar to the expression
 | |
| @smallexample
 | |
| ((x) < (y) ? (x) : (y))
 | |
| @end smallexample
 | |
| except that @var{x} and @var{y} are only evaluated once.
 | |
| 
 | |
| If an argument is NaN, the other argument is returned.  If both arguments
 | |
| are NaN, NaN is returned.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double fmax (double @var{x}, double @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float fmaxf (float @var{x}, float @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} fmaxl (long double @var{x}, long double @var{y})
 | |
| The @code{fmax} function returns the greater of the two values @var{x}
 | |
| and @var{y}.
 | |
| 
 | |
| If an argument is NaN, the other argument is returned.  If both arguments
 | |
| are NaN, NaN is returned.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double fdim (double @var{x}, double @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float fdimf (float @var{x}, float @var{y})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} fdiml (long double @var{x}, long double @var{y})
 | |
| The @code{fdim} function returns the positive difference between
 | |
| @var{x} and @var{y}.  The positive difference is @math{@var{x} -
 | |
| @var{y}} if @var{x} is greater than @var{y}, and @math{0} otherwise.
 | |
| 
 | |
| If @var{x}, @var{y}, or both are NaN, NaN is returned.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefun double fma (double @var{x}, double @var{y}, double @var{z})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx float fmaf (float @var{x}, float @var{y}, float @var{z})
 | |
| @comment math.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} fmal (long double @var{x}, long double @var{y}, long double @var{z})
 | |
| @cindex butterfly
 | |
| The @code{fma} function performs floating-point multiply-add.  This is
 | |
| the operation @math{(@var{x} @mul{} @var{y}) + @var{z}}, but the
 | |
| intermediate result is not rounded to the destination type.  This can
 | |
| sometimes improve the precision of a calculation.
 | |
| 
 | |
| This function was introduced because some processors have a special
 | |
| instruction to perform multiply-add.  The C compiler cannot use it
 | |
| directly, because the expression @samp{x*y + z} is defined to round the
 | |
| intermediate result.  @code{fma} lets you choose when you want to round
 | |
| only once.
 | |
| 
 | |
| @vindex FP_FAST_FMA
 | |
| On processors which do not implement multiply-add in hardware,
 | |
| @code{fma} can be very slow since it must avoid intermediate rounding.
 | |
| @file{math.h} defines the symbols @code{FP_FAST_FMA},
 | |
| @code{FP_FAST_FMAF}, and @code{FP_FAST_FMAL} when the corresponding
 | |
| version of @code{fma} is no slower than the expression @samp{x*y + z}.
 | |
| In the GNU C library, this always means the operation is implemented in
 | |
| hardware.
 | |
| @end deftypefun
 | |
| 
 | |
| @node Complex Numbers
 | |
| @section Complex Numbers
 | |
| @pindex complex.h
 | |
| @cindex complex numbers
 | |
| 
 | |
| @w{ISO C99} introduces support for complex numbers in C.  This is done
 | |
| with a new type specifier, @code{complex}.  It is a keyword if and only
 | |
| if @file{complex.h} has been included.  There are three complex types,
 | |
| corresponding to the three real types:  @code{float complex},
 | |
| @code{double complex}, and @code{long double complex}.
 | |
| 
 | |
| To construct complex numbers you need a way to indicate the imaginary
 | |
| part of a number.  There is no standard notation for an imaginary
 | |
| floating point constant.  Instead, @file{complex.h} defines two macros
 | |
| that can be used to create complex numbers.
 | |
| 
 | |
| @deftypevr Macro {const float complex} _Complex_I
 | |
| This macro is a representation of the complex number ``@math{0+1i}''.
 | |
| Multiplying a real floating-point value by @code{_Complex_I} gives a
 | |
| complex number whose value is purely imaginary.  You can use this to
 | |
| construct complex constants:
 | |
| 
 | |
| @smallexample
 | |
| @math{3.0 + 4.0i} = @code{3.0 + 4.0 * _Complex_I}
 | |
| @end smallexample
 | |
| 
 | |
| Note that @code{_Complex_I * _Complex_I} has the value @code{-1}, but
 | |
| the type of that value is @code{complex}.
 | |
| @end deftypevr
 | |
| 
 | |
| @c Put this back in when gcc supports _Imaginary_I.  It's too confusing.
 | |
| @ignore
 | |
| @noindent
 | |
| Without an optimizing compiler this is more expensive than the use of
 | |
| @code{_Imaginary_I} but it is better than nothing.  You can avoid all
 | |
| the hassles if you use the @code{I} macro below, provided the name is
 | |
| not a problem.
 | |
| 
 | |
| @deftypevr Macro {const float imaginary} _Imaginary_I
 | |
| This macro is a representation of the value ``@math{1i}''.  I.e., it is
 | |
| the value for which
 | |
| 
 | |
| @smallexample
 | |
| _Imaginary_I * _Imaginary_I = -1
 | |
| @end smallexample
 | |
| 
 | |
| @noindent
 | |
| The result is not of type @code{float imaginary} but instead @code{float}.
 | |
| One can use it to construct complex numbers easily, as in
 | |
| 
 | |
| @smallexample
 | |
| 3.0 - _Imaginary_I * 4.0
 | |
| @end smallexample
 | |
| 
 | |
| @noindent
 | |
| which results in the complex number with a real part of 3.0 and an
 | |
| imaginary part of -4.0.
 | |
| @end deftypevr
 | |
| @end ignore
 | |
| 
 | |
| @noindent
 | |
| @code{_Complex_I} is a bit of a mouthful.  @file{complex.h} also defines
 | |
| a shorter name for the same constant.
 | |
| 
 | |
| @deftypevr Macro {const float complex} I
 | |
| This macro has exactly the same value as @code{_Complex_I}.  Most of the
 | |
| time it is preferable.  However, it causes problems if you want to use
 | |
| the identifier @code{I} for something else.  You can safely write
 | |
| 
 | |
| @smallexample
 | |
| #include <complex.h>
 | |
| #undef I
 | |
| @end smallexample
 | |
| 
 | |
| @noindent
 | |
| if you need @code{I} for your own purposes.  (In that case we recommend
 | |
| you also define some other short name for @code{_Complex_I}, such as
 | |
| @code{J}.)
 | |
| 
 | |
| @ignore
 | |
| If the implementation does not support the @code{imaginary} types
 | |
| @code{I} is defined as @code{_Complex_I} which is the second best
 | |
| solution.  It still can be used in the same way but requires a more
 | |
| clever compiler to get the same results.
 | |
| @end ignore
 | |
| @end deftypevr
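As a sketch of how the @code{I} macro is used in practice, the
function below builds a complex value from its Cartesian parts.  The
name @code{make_complex} is hypothetical, and the example assumes
finite arguments.

```c
#include <complex.h>

/* Hypothetical helper: build RE + IM*i from two real values.
   Assumes both arguments are finite.  */
double complex
make_complex (double re, double im)
{
  return re + im * I;
}
```

The parts can be read back with @code{creal} and @code{cimag}; for
instance @code{make_complex (3.0, 4.0)} has real part @code{3.0} and
imaginary part @code{4.0}.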
 | |
| 
 | |
| @node Operations on Complex
 | |
| @section Projections, Conjugates, and Decomposing of Complex Numbers
 | |
| @cindex project complex numbers
 | |
| @cindex conjugate complex numbers
 | |
| @cindex decompose complex numbers
 | |
| @pindex complex.h
 | |
| 
 | |
| @w{ISO C99} also defines functions that perform basic operations on
 | |
| complex numbers, such as decomposition and conjugation.  The prototypes
 | |
| for all these functions are in @file{complex.h}.  All functions are
 | |
| available in three variants, one for each of the three complex types.
 | |
| 
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefun double creal (complex double @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx float crealf (complex float @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} creall (complex long double @var{z})
 | |
| These functions return the real part of the complex number @var{z}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefun double cimag (complex double @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx float cimagf (complex float @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} cimagl (complex long double @var{z})
 | |
| These functions return the imaginary part of the complex number @var{z}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefun {complex double} conj (complex double @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {complex float} conjf (complex float @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {complex long double} conjl (complex long double @var{z})
 | |
| These functions return the conjugate value of the complex number
 | |
| @var{z}.  The conjugate of a complex number has the same real part and a
 | |
| negated imaginary part.  In other words, @samp{conj(a + bi) = a + -bi}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefun double carg (complex double @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx float cargf (complex float @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} cargl (complex long double @var{z})
 | |
| These functions return the argument of the complex number @var{z}.
 | |
| The argument of a complex number is the angle in the complex plane
 | |
| between the positive real axis and a line passing through zero and the
 | |
| number.  This angle is measured in the usual fashion and ranges from
 | |
| @math{-@pi{}} to @math{@pi{}}.
 | |
| 
 | |
| @code{carg} has a branch cut along the negative real axis.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefun {complex double} cproj (complex double @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {complex float} cprojf (complex float @var{z})
 | |
| @comment complex.h
 | |
| @comment ISO
 | |
| @deftypefunx {complex long double} cprojl (complex long double @var{z})
 | |
| These functions return the projection of the complex value @var{z} onto
 | |
| the Riemann sphere.  Values with an infinite imaginary part are projected
 | |
| to positive infinity on the real axis, even if the real part is NaN.  If
 | |
| the real part is infinite, the result is equivalent to
 | |
| 
 | |
| @smallexample
 | |
| INFINITY + I * copysign (0.0, cimag (z))
 | |
| @end smallexample
 | |
| @end deftypefun
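The decomposition and conjugation functions are often used together.
The sketch below computes the squared magnitude of a complex number as
the product of the number and its conjugate; the helper name
@code{norm_squared} is hypothetical.

```c
#include <complex.h>

/* Hypothetical helper: squared magnitude of Z, computed as
   z * conj (z).  The product is mathematically real, so only its
   real part is returned.  */
double
norm_squared (double complex z)
{
  return creal (z * conj (z));
}
```

For @code{3.0 + 4.0 * I} this yields @code{25.0}.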
 | |
| 
 | |
| @node Parsing of Numbers
 | |
| @section Parsing of Numbers
 | |
| @cindex parsing numbers (in formatted input)
 | |
| @cindex converting strings to numbers
 | |
| @cindex number syntax, parsing
 | |
| @cindex syntax, for reading numbers
 | |
| 
 | |
| This section describes functions for ``reading'' integer and
 | |
| floating-point numbers from a string.  It may be more convenient in some
 | |
| cases to use @code{sscanf} or one of the related functions; see
 | |
| @ref{Formatted Input}.  But often you can make a program more robust by
 | |
| finding the tokens in the string by hand, then converting the numbers
 | |
| one by one.
 | |
| 
 | |
| @menu
 | |
| * Parsing of Integers::         Functions for conversion of integer values.
 | |
| * Parsing of Floats::           Functions for conversion of floating-point
 | |
| 				 values.
 | |
| @end menu
 | |
| 
 | |
| @node Parsing of Integers
 | |
| @subsection Parsing of Integers
 | |
| 
 | |
| @pindex stdlib.h
 | |
| These functions are declared in @file{stdlib.h}.
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun {long int} strtol (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| The @code{strtol} (``string-to-long'') function converts the initial
 | |
| part of @var{string} to a signed integer, which is returned as a value
 | |
| of type @code{long int}.
 | |
| 
 | |
| This function attempts to decompose @var{string} as follows:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item
 | |
| A (possibly empty) sequence of whitespace characters.  Which characters
 | |
| are whitespace is determined by the @code{isspace} function
 | |
| (@pxref{Classification of Characters}).  These are discarded.
 | |
| 
 | |
| @item
 | |
| An optional plus or minus sign (@samp{+} or @samp{-}).
 | |
| 
 | |
| @item
 | |
| A nonempty sequence of digits in the radix specified by @var{base}.
 | |
| 
 | |
| If @var{base} is zero, decimal radix is assumed unless the series of
 | |
| digits begins with @samp{0} (specifying octal radix), or @samp{0x} or
 | |
| @samp{0X} (specifying hexadecimal radix); in other words, the same
 | |
| syntax used for integer constants in C.
 | |
| 
 | |
| Otherwise @var{base} must have a value between @code{2} and @code{36}.
 | |
| If @var{base} is @code{16}, the digits may optionally be preceded by
 | |
| @samp{0x} or @samp{0X}.  If @var{base} has no legal value, the value returned
 | |
| is @code{0l} and the global variable @code{errno} is set to @code{EINVAL}.
 | |
| 
 | |
| @item
 | |
| Any remaining characters in the string.  If @var{tailptr} is not a null
 | |
| pointer, @code{strtol} stores a pointer to this tail in
 | |
| @code{*@var{tailptr}}.
 | |
| @end itemize
 | |
| 
 | |
| If the string is empty, contains only whitespace, or does not contain an
 | |
| initial substring that has the expected syntax for an integer in the
 | |
| specified @var{base}, no conversion is performed.  In this case,
 | |
| @code{strtol} returns a value of zero and the value stored in
 | |
| @code{*@var{tailptr}} is the value of @var{string}.
 | |
| 
 | |
| In a locale other than the standard @code{"C"} locale, this function
 | |
| may recognize additional implementation-dependent syntax.
 | |
| 
 | |
| If the string has valid syntax for an integer but the value is not
 | |
| representable because of overflow, @code{strtol} returns either
 | |
| @code{LONG_MAX} or @code{LONG_MIN} (@pxref{Range of Type}), as
 | |
| appropriate for the sign of the value.  It also sets @code{errno}
 | |
| to @code{ERANGE} to indicate there was overflow.
 | |
| 
 | |
| You should not check for errors by examining the return value of
 | |
| @code{strtol}, because the string might be a valid representation of
 | |
| @code{0l}, @code{LONG_MAX}, or @code{LONG_MIN}.  Instead, check whether
 | |
| @var{tailptr} points to what you expect after the number
 | |
| (e.g. @code{'\0'} if the string should end after the number).  You also
 | |
| need to clear @code{errno} before the call and check it afterward, in
 | |
| case there was overflow.
 | |
| 
 | |
| There is an example at the end of this section.
 | |
| @end deftypefun
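The error-checking advice above can be collected into one wrapper.
This is a minimal sketch, not the only reasonable policy: the function
name @code{parse_long}, its return convention, and the decision to
reject trailing characters are all assumptions made for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse all of TEXT as a base-10 long int.
   Returns 0 and stores the value in *OUT on success; returns -1 if
   TEXT holds no number, has trailing characters, or overflows.  */
int
parse_long (const char *text, long int *out)
{
  char *tail;
  long int value;

  errno = 0;                 /* so a stale ERANGE is not mistaken */
  value = strtol (text, &tail, 10);
  if (tail == text)          /* no conversion was performed */
    return -1;
  if (*tail != '\0')         /* something follows the number */
    return -1;
  if (errno == ERANGE)       /* value is LONG_MAX or LONG_MIN */
    return -1;
  *out = value;
  return 0;
}
```

Note that the return value of @code{strtol} itself is never used to
detect failure; only @code{tail} and @code{errno} are examined.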
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun {unsigned long int} strtoul (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| The @code{strtoul} (``string-to-unsigned-long'') function is like
 | |
| @code{strtol} except it converts to an @code{unsigned long int} value.
 | |
| The syntax is the same as described above for @code{strtol}.  The value
 | |
| returned on overflow is @code{ULONG_MAX} (@pxref{Range of Type}).
 | |
| 
 | |
| If @var{string} depicts a negative number, @code{strtoul} acts the same
 | |
| as @code{strtol} but casts the result to an unsigned integer.  That means
 | |
| for example that @code{strtoul} on @code{"-1"} returns @code{ULONG_MAX}
 | |
| and an input more negative than @code{LONG_MIN} returns
 | |
| (@code{ULONG_MAX} + 1) / 2.
 | |
| 
 | |
| @code{strtoul} sets @code{errno} to @code{EINVAL} if @var{base} is out of
 | |
| range, or @code{ERANGE} on overflow.
 | |
| @end deftypefun
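Since @code{strtoul} silently accepts a leading minus sign, a program
that wants only nonnegative input has to check for the sign itself.
The sketch below shows one way to do so; the name @code{parse_ulong}
and its return convention are assumptions made for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: like strtoul in base 10, but refuse input
   whose first non-whitespace character is a minus sign.
   Returns 0 on success and -1 on any error.  */
int
parse_ulong (const char *text, unsigned long int *out)
{
  const char *p = text;
  char *tail;
  unsigned long int value;

  while (isspace ((unsigned char) *p))
    p++;
  if (*p == '-')
    return -1;

  errno = 0;
  value = strtoul (p, &tail, 10);
  if (tail == p || *tail != '\0' || errno == ERANGE)
    return -1;
  *out = value;
  return 0;
}
```

Without the extra check, @code{strtoul ("-1", NULL, 10)} would return
@code{ULONG_MAX} and report no error.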
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun {long long int} strtoll (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| The @code{strtoll} function is like @code{strtol} except that it returns
 | |
| a @code{long long int} value, and accepts numbers with a correspondingly
 | |
| larger range.
 | |
| 
 | |
| If the string has valid syntax for an integer but the value is not
 | |
| representable because of overflow, @code{strtoll} returns either
 | |
| @code{LONG_LONG_MAX} or @code{LONG_LONG_MIN} (@pxref{Range of Type}), as
 | |
| appropriate for the sign of the value.  It also sets @code{errno} to
 | |
| @code{ERANGE} to indicate there was overflow.
 | |
| 
 | |
| The @code{strtoll} function was introduced in @w{ISO C99}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment BSD
 | |
| @deftypefun {long long int} strtoq (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| @code{strtoq} (``string-to-quad-word'') is the BSD name for @code{strtoll}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun {unsigned long long int} strtoull (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| The @code{strtoull} function is related to @code{strtoll} the same way
 | |
| @code{strtoul} is related to @code{strtol}.
 | |
| 
 | |
| The @code{strtoull} function was introduced in @w{ISO C99}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment BSD
 | |
| @deftypefun {unsigned long long int} strtouq (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| @code{strtouq} is the BSD name for @code{strtoull}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment inttypes.h
 | |
| @comment ???
 | |
| @deftypefun intmax_t strtoimax (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| The @code{strtoimax} function is like @code{strtol} except that it returns
 | |
| an @code{intmax_t} value, and accepts numbers of a corresponding range.
 | |
| 
 | |
| If the string has valid syntax for an integer but the value is not
 | |
| representable because of overflow, @code{strtoimax} returns either
 | |
| @code{INTMAX_MAX} or @code{INTMAX_MIN} (@pxref{Integers}), as
 | |
| appropriate for the sign of the value.  It also sets @code{errno} to
 | |
| @code{ERANGE} to indicate there was overflow.
 | |
| 
 | |
| The symbols for @code{strtoimax} are declared in @file{inttypes.h}.
 | |
| 
 | |
| See @ref{Integers} for a description of the @code{intmax_t} type.
 | |
| 
 | |
| @end deftypefun
 | |
| 
 | |
| @comment inttypes.h
 | |
| @comment ???
 | |
| @deftypefun uintmax_t strtoumax (const char *@var{string}, char **@var{tailptr}, int @var{base})
 | |
| The @code{strtoumax} function is related to @code{strtoimax}
 | |
| the same way that @code{strtoul} is related to @code{strtol}.
 | |
| 
 | |
| The symbols for @code{strtoumax} are declared in @file{inttypes.h}.
 | |
| 
 | |
| See @ref{Integers} for a description of the @code{uintmax_t} type.
 | |
| @end deftypefun
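Parsing into the widest integer type and then range-checking is a
convenient way to fill a narrower fixed-width type.  The following is
a sketch under that approach; the name @code{parse_int32} and its
return convention are hypothetical.

```c
#include <errno.h>
#include <inttypes.h>

/* Hypothetical helper: parse TEXT using C integer-constant syntax
   (base 0) and check that the value fits in an int32_t.
   Returns 0 on success and -1 on any error.  */
int
parse_int32 (const char *text, int32_t *out)
{
  char *tail;
  intmax_t value;

  errno = 0;
  value = strtoimax (text, &tail, 0);
  if (tail == text || *tail != '\0' || errno == ERANGE)
    return -1;
  if (value < INT32_MIN || value > INT32_MAX)
    return -1;
  *out = (int32_t) value;
  return 0;
}
```

With a @var{base} of zero, @code{"0x7fffffff"} is read as hexadecimal
and @code{"010"} as octal.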
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun {long int} atol (const char *@var{string})
 | |
| This function is similar to the @code{strtol} function with a @var{base}
 | |
| argument of @code{10}, except that it need not detect overflow errors.
 | |
| The @code{atol} function is provided mostly for compatibility with
 | |
| existing code; using @code{strtol} is more robust.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun int atoi (const char *@var{string})
 | |
| This function is like @code{atol}, except that it returns an @code{int}.
 | |
| The @code{atoi} function is also considered obsolete; use @code{strtol}
 | |
| instead.
 | |
| @end deftypefun
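
One reason @code{strtol} is more robust is that @code{atoi} returns
@code{0} both for the string @code{"0"} and for a string that contains no
number at all.  The sketch below shows how @var{tailptr} tells the two
cases apart; the helper name @code{starts_with_number} is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: report whether TEXT begins with a decimal
   integer, which is something atoi cannot tell the caller.  */
bool
starts_with_number (const char *text)
{
  char *tail;

  assert (text != NULL);

  /* The value is ignored here; only the tail pointer matters.  */
  (void) strtol (text, &tail, 10);

  /* If nothing was converted, strtol stores TEXT itself in TAIL.  */
  return tail != text;
}
```

With this helper, @code{starts_with_number ("0")} is true and
@code{starts_with_number ("abc")} is false, although @code{atoi} yields
@code{0} for both strings.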
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun {long long int} atoll (const char *@var{string})
 | |
| This function is similar to @code{atol}, except it returns a @code{long
 | |
| long int}.
 | |
| 
 | |
| The @code{atoll} function was introduced in @w{ISO C99}.  It too is
 | |
| obsolete (despite having just been added); use @code{strtoll} instead.
 | |
| @end deftypefun
 | |
| 
 | |
| @c !!! please fact check this paragraph -zw
 | |
| @findex strtol_l
 | |
| @findex strtoul_l
 | |
| @findex strtoll_l
 | |
| @findex strtoull_l
 | |
| @cindex parsing numbers and locales
 | |
| @cindex locales, parsing numbers and
 | |
| Some locales specify a printed syntax for numbers other than the one
 | |
| that these functions understand.  If you need to read numbers formatted
 | |
| in some other locale, you can use the @code{strtoX_l} functions.  Each
 | |
| of the @code{strtoX} functions has a counterpart with @samp{_l} added to
 | |
| its name.  The @samp{_l} counterparts take an additional argument, a
 | |
| locale object of type @code{locale_t}, which describes how the numbers
 | |
| to be read are formatted.  @xref{Locales}.
 | |
| 
 | |
| @strong{Portability Note:} These functions are all GNU extensions.  You
 | |
| can also use @code{scanf} or its relatives, which have the @samp{'} flag
 | |
| for parsing numeric input according to the current locale
 | |
| (@pxref{Numeric Input Conversions}).  This feature is standard.
 | |
| 
 | |
| Here is a function which parses a string as a sequence of integers and
 | |
| returns the sum of them:
 | |
| 
 | |
| @smallexample
 | |
| int
 | |
| sum_ints_from_string (char *string)
 | |
| @{
 | |
|   int sum = 0;
 | |
| 
 | |
|   while (1) @{
 | |
|     char *tail;
 | |
|     long int next;
 | |
| 
 | |
|     /* @r{Skip whitespace by hand, to detect the end.}  */
 | |
|     while (isspace ((unsigned char) *string)) string++;
 | |
|     if (*string == 0)
 | |
|       break;
 | |
| 
 | |
|     /* @r{There is more nonwhitespace,}  */
 | |
|     /* @r{so it ought to be another number.}  */
 | |
|     errno = 0;
 | |
|     /* @r{Parse it.}  */
 | |
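|     next = strtol (string, &tail, 0);
 | |
|     /* @r{Stop if nothing was parsed, to avoid looping forever.}  */
 | |
|     if (tail == string)
 | |
|       break;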
 | |
|     /* @r{Add it in, if not overflow.}  */
 | |
|     if (errno)
 | |
|       printf ("Overflow\n");
 | |
|     else
 | |
|       sum += next;
 | |
|     /* @r{Advance past it.}  */
 | |
|     string = tail;
 | |
|   @}
 | |
| 
 | |
|   return sum;
 | |
| @}
 | |
| @end smallexample
 | |
| 
 | |
| @node Parsing of Floats
 | |
| @subsection Parsing of Floats
 | |
| 
 | |
| @pindex stdlib.h
 | |
| These functions are declared in @file{stdlib.h}.
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun double strtod (const char *@var{string}, char **@var{tailptr})
 | |
| The @code{strtod} (``string-to-double'') function converts the initial
 | |
| part of @var{string} to a floating-point number, which is returned as a
 | |
| value of type @code{double}.
 | |
| 
 | |
| This function attempts to decompose @var{string} as follows:
 | |
| 
 | |
| @itemize @bullet
 | |
| @item
 | |
| A (possibly empty) sequence of whitespace characters.  Which characters
 | |
| are whitespace is determined by the @code{isspace} function
 | |
| (@pxref{Classification of Characters}).  These are discarded.
 | |
| 
 | |
| @item
 | |
| An optional plus or minus sign (@samp{+} or @samp{-}).
 | |
| 
 | |
| @item A floating point number in decimal or hexadecimal format.  The
 | |
| decimal format is:
 | |
| @itemize @minus
 | |
| 
 | |
| @item
 | |
| A nonempty sequence of digits optionally containing a decimal-point
 | |
| character---normally @samp{.}, but it depends on the locale
 | |
| (@pxref{General Numeric}).
 | |
| 
 | |
| @item
 | |
| An optional exponent part, consisting of a character @samp{e} or
 | |
| @samp{E}, an optional sign, and a sequence of digits.
 | |
| 
 | |
| @end itemize
 | |
| 
 | |
| The hexadecimal format is as follows:
 | |
| @itemize @minus
 | |
| 
 | |
| @item
 | |
| A @samp{0x} or @samp{0X} followed by a nonempty sequence of hexadecimal digits
 | |
| optionally containing a decimal-point character---normally @samp{.}, but
 | |
| it depends on the locale (@pxref{General Numeric}).
 | |
| 
 | |
| @item
 | |
| An optional binary-exponent part, consisting of a character @samp{p} or
 | |
| @samp{P}, an optional sign, and a sequence of digits.
 | |
| 
 | |
| @end itemize
 | |
| 
 | |
| @item
 | |
| Any remaining characters in the string.  If @var{tailptr} is not a null
 | |
| pointer, a pointer to this tail of the string is stored in
 | |
| @code{*@var{tailptr}}.
 | |
| @end itemize
 | |
| 
 | |
| If the string is empty, contains only whitespace, or does not contain an
 | |
| initial substring that has the expected syntax for a floating-point
 | |
| number, no conversion is performed.  In this case, @code{strtod} returns
 | |
| a value of zero and the value returned in @code{*@var{tailptr}} is the
 | |
| value of @var{string}.
 | |
| 
 | |
| In a locale other than the standard @code{"C"} or @code{"POSIX"} locales,
 | |
| this function may recognize additional locale-dependent syntax.
 | |
| 
 | |
| If the string has valid syntax for a floating-point number but the value
 | |
| is outside the range of a @code{double}, @code{strtod} will signal
 | |
| overflow or underflow as described in @ref{Math Error Reporting}.
 | |
| 
 | |
| @code{strtod} recognizes four special input strings.  The strings
 | |
| @code{"inf"} and @code{"infinity"} are converted to @math{@infinity{}},
 | |
| or to the largest representable value if the floating-point format
 | |
| doesn't support infinities.  You can prepend a @code{"+"} or @code{"-"}
 | |
| to specify the sign.  Case is ignored when scanning these strings.
 | |
| 
 | |
| The strings @code{"nan"} and @code{"nan(@var{chars...})"} are converted
 | |
| to NaN.  Again, case is ignored.  If @var{chars...} are provided, they
 | |
| are used in some unspecified fashion to select a particular
 | |
| representation of NaN (there can be several).
 | |
| 
 | |
| Since zero is a valid result as well as the value returned on error, you
 | |
| should check for errors in the same way as for @code{strtol}, by
 | |
| examining @code{errno} and @var{tailptr}.
 | |
| @end deftypefun
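
The error checks recommended above might be combined as in this minimal
sketch.  The helper name @code{parse_double} is hypothetical, and the
policy of rejecting trailing characters and range errors is an assumption
made for the example, not something @code{strtod} requires.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse all of TEXT as a double.
   Returns true and stores the value in *OUT on success.  */
bool
parse_double (const char *text, double *out)
{
  char *tail;
  double value;

  assert (text != NULL && out != NULL);

  /* Zero is a valid result, so errors must be found by other means.  */
  errno = 0;
  value = strtod (text, &tail);

  if (tail == text)       /* No conversion was performed.  */
    return false;
  if (errno == ERANGE)    /* Overflow or underflow was signalled.  */
    return false;
  if (*tail != '\0')      /* Unparsed characters follow the number.  */
    return false;

  *out = value;
  return true;
}
```

In the @code{"C"} locale this accepts decimal input such as
@code{"1.5e2"}, hexadecimal input such as @code{"0x1.8p1"}, and the
special strings such as @code{"-inf"}.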
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun float strtof (const char *@var{string}, char **@var{tailptr})
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefunx {long double} strtold (const char *@var{string}, char **@var{tailptr})
 | |
| These functions are analogous to @code{strtod}, but return @code{float}
 | |
| and @code{long double} values respectively.  They report errors in the
 | |
| same way as @code{strtod}.  @code{strtof} can be substantially faster
 | |
| than @code{strtod}, but has less precision; conversely, @code{strtold}
 | |
| can be much slower but has more precision (on systems where @code{long
 | |
| double} is a separate type).
 | |
| 
 | |
| These functions were originally GNU extensions and are now part of @w{ISO C99}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment ISO
 | |
| @deftypefun double atof (const char *@var{string})
 | |
| This function is similar to the @code{strtod} function, except that it
 | |
| need not detect overflow and underflow errors.  The @code{atof} function
 | |
| is provided mostly for compatibility with existing code; using
 | |
| @code{strtod} is more robust.
 | |
| @end deftypefun
 | |
| 
 | |
| The GNU C library also provides @samp{_l} versions of these functions,
 | |
| which take an additional argument, the locale to use in conversion.
 | |
| @xref{Parsing of Integers}.
 | |
| 
 | |
| @node System V Number Conversion
 | |
| @section Old-fashioned System V number-to-string functions
 | |
| 
 | |
| The old @w{System V} C library provided three functions to convert
 | |
| numbers to strings, with unusual and hard-to-use semantics.  The GNU C
 | |
| library also provides these functions and some natural extensions.
 | |
| 
 | |
| These functions are only available in glibc and on systems descended
 | |
| from AT&T Unix.  Therefore, unless these functions do precisely what you
 | |
| need, it is better to use @code{sprintf}, which is standard.
 | |
| 
 | |
| All these functions are defined in @file{stdlib.h}.
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment SVID, Unix98
 | |
| @deftypefun {char *} ecvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
 | |
| The function @code{ecvt} converts the floating-point number @var{value}
 | |
| to a string with at most @var{ndigit} decimal digits.  The
 | |
| returned string contains no decimal point or sign.  The first digit of
 | |
| the string is non-zero (unless @var{value} is actually zero) and the
 | |
| last digit is rounded to nearest.  @code{*@var{decpt}} is set to the
 | |
| index in the string of the first digit after the decimal point.
 | |
| @code{*@var{neg}} is set to a nonzero value if @var{value} is negative,
 | |
| zero otherwise.
 | |
| 
 | |
| If @var{ndigit} decimal digits would exceed the precision of a
 | |
| @code{double}, it is reduced to a system-specific value.
 | |
| 
 | |
| The returned string is statically allocated and overwritten by each call
 | |
| to @code{ecvt}.
 | |
| 
 | |
| If @var{value} is zero, it is implementation defined whether
 | |
| @code{*@var{decpt}} is @code{0} or @code{1}.
 | |
| 
 | |
| For example: @code{ecvt (12.3, 5, &d, &n)} returns @code{"12300"}
 | |
| and sets @var{d} to @code{2} and @var{n} to @code{0}.
 | |
| @end deftypefun
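
Because the returned string has neither sign nor decimal point, the
caller has to put them back.  The following is a minimal sketch of that
step; the helper name @code{ecvt_to_plain} is hypothetical, it handles
only the case where the decimal point falls inside the digit string, and
defining @code{_DEFAULT_SOURCE} is an assumption about what the GNU C
library needs in order to declare @code{ecvt}.

```c
#define _DEFAULT_SOURCE   /* Assumption: exposes ecvt in glibc's stdlib.h.  */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write VALUE to OUT as plain decimal text with
   NDIGIT significant digits.  Returns the length written, or -1 if the
   decimal point would not fall inside the digit string.  */
int
ecvt_to_plain (double value, int ndigit, char *out, size_t len)
{
  int decpt, neg;
  const char *digits;
  size_t ndigits;

  assert (out != NULL && ndigit > 0);

  digits = ecvt (value, ndigit, &decpt, &neg);
  ndigits = strlen (digits);

  /* This sketch does not pad with zeros on either side.  */
  if (decpt <= 0 || (size_t) decpt >= ndigits)
    return -1;

  /* Emit the sign, the first DECPT digits, the point, then the rest.  */
  return snprintf (out, len, "%s%.*s.%s",
                   neg ? "-" : "", decpt, digits, digits + decpt);
}
```

For the example above, @code{ecvt_to_plain (12.3, 5, buf, sizeof buf)}
stores @code{"12.300"} in @code{buf}.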
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment SVID, Unix98
 | |
| @deftypefun {char *} fcvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
 | |
| The function @code{fcvt} is like @code{ecvt}, but @var{ndigit} specifies
 | |
| the number of digits after the decimal point.  If @var{ndigit} is less
 | |
| than zero, @var{value} is rounded to the @math{|@var{ndigit}|+1}'th place to the
 | |
| left of the decimal point.  For example, if @var{ndigit} is @code{-1},
 | |
| @var{value} will be rounded to the nearest 10.  If @var{ndigit} is
 | |
| negative and its magnitude exceeds the number of digits to the left of the decimal
 | |
| point in @var{value}, @var{value} will be rounded to one significant digit.
 | |
| 
 | |
| If @var{ndigit} decimal digits would exceed the precision of a
 | |
| @code{double}, it is reduced to a system-specific value.
 | |
| 
 | |
| The returned string is statically allocated and overwritten by each call
 | |
| to @code{fcvt}.
 | |
| @end deftypefun
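
Since each call overwrites the static buffer, a result that must survive
a later call has to be copied first.  Here is a minimal sketch of one way
to do that; the helper name @code{fcvt_dup} is hypothetical, and defining
@code{_DEFAULT_SOURCE} is an assumption about what the GNU C library
needs in order to declare @code{fcvt}.

```c
#define _DEFAULT_SOURCE   /* Assumption: exposes fcvt in glibc's stdlib.h.  */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: call fcvt and return a malloc'd copy of the
   digit string, or NULL if memory is exhausted.  The caller frees it.  */
char *
fcvt_dup (double value, int ndigit, int *decpt, int *neg)
{
  const char *digits;
  char *copy;
  size_t size;

  assert (decpt != NULL && neg != NULL);

  digits = fcvt (value, ndigit, decpt, neg);

  /* Copy out of the static buffer before anyone calls fcvt again.  */
  size = strlen (digits) + 1;
  copy = malloc (size);
  if (copy != NULL)
    memcpy (copy, digits, size);
  return copy;
}
```

Two results obtained this way, for instance from @code{12.5} and
@code{100.25} with an @var{ndigit} of @code{2}, remain valid side by side
as @code{"1250"} and @code{"10025"}.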
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment SVID, Unix98
 | |
| @deftypefun {char *} gcvt (double @var{value}, int @var{ndigit}, char *@var{buf})
 | |
| @code{gcvt} is functionally equivalent to @samp{sprintf(buf, "%.*g",
 | |
| ndigit, value)}.  It is provided only for compatibility's sake.  It
 | |
| returns @var{buf}.
 | |
| 
 | |
| If @var{ndigit} decimal digits would exceed the precision of a
 | |
| @code{double}, it is reduced to a system-specific value.
 | |
| @end deftypefun
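
The relationship to the @code{printf} family can be checked directly.
This minimal sketch assumes that @var{ndigit} acts as the precision of a
@samp{%g} conversion, that is, @samp{%.*g}; the helper name
@code{gcvt_matches_printf} is hypothetical, and defining
@code{_DEFAULT_SOURCE} is an assumption about what the GNU C library
needs in order to declare @code{gcvt}.

```c
#define _DEFAULT_SOURCE   /* Assumption: exposes gcvt in glibc's stdlib.h.  */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical check: does gcvt produce the same text as snprintf
   with a "%.*g" conversion for VALUE and NDIGIT?  */
bool
gcvt_matches_printf (double value, int ndigit)
{
  char from_gcvt[64];
  char from_printf[64];

  /* Stay within the precision of a double so NDIGIT is not reduced.  */
  assert (ndigit > 0 && ndigit <= 17);

  gcvt (value, ndigit, from_gcvt);
  snprintf (from_printf, sizeof from_printf, "%.*g", ndigit, value);

  return strcmp (from_gcvt, from_printf) == 0;
}
```

In new code @code{snprintf} is the better choice, because it takes the
buffer size and so cannot write past the end of the buffer.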
 | |
| 
 | |
| As extensions, the GNU C library provides versions of these three
 | |
| functions that take @code{long double} arguments.
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment GNU
 | |
| @deftypefun {char *} qecvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
 | |
| This function is equivalent to @code{ecvt} except that it takes a
 | |
| @code{long double} for the first parameter and that @var{ndigit} is
 | |
| restricted by the precision of a @code{long double}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment GNU
 | |
| @deftypefun {char *} qfcvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg})
 | |
| This function is equivalent to @code{fcvt} except that it
 | |
| takes a @code{long double} for the first parameter and that @var{ndigit} is
 | |
| restricted by the precision of a @code{long double}.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment GNU
 | |
| @deftypefun {char *} qgcvt (long double @var{value}, int @var{ndigit}, char *@var{buf})
 | |
| This function is equivalent to @code{gcvt} except that it takes a
 | |
| @code{long double} for the first parameter and that @var{ndigit} is
 | |
| restricted by the precision of a @code{long double}.
 | |
| @end deftypefun
 | |
| 
 | |
| 
 | |
| @cindex gcvt_r
 | |
| The @code{ecvt} and @code{fcvt} functions, and their @code{long double}
 | |
| equivalents, all return a string located in a static buffer which is
 | |
| overwritten by the next call to the function.  The GNU C library
 | |
| provides another set of extended functions which write the converted
 | |
| string into a user-supplied buffer.  These have the conventional
 | |
| @code{_r} suffix.
 | |
| 
 | |
| @code{gcvt_r} is not necessary, because @code{gcvt} already uses a
 | |
| user-supplied buffer.
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment GNU
 | |
| @deftypefun int ecvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
 | |
| The @code{ecvt_r} function is the same as @code{ecvt}, except
 | |
| that it places its result into the user-specified buffer pointed to by
 | |
| @var{buf}, with length @var{len}.
 | |
| 
 | |
| This function is a GNU extension.
 | |
| @end deftypefun
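
A user-supplied buffer makes it possible to hold two conversions at the
same time.  The following minimal sketch relies on that; the helper name
@code{same_digits} is hypothetical, and the code assumes that
@file{stdlib.h} declares @code{ecvt_r} as returning an @code{int} that is
zero on success, and that @code{_DEFAULT_SOURCE} makes the declaration
visible.

```c
#define _DEFAULT_SOURCE   /* Assumption: exposes ecvt_r in glibc's stdlib.h.  */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: report whether X and Y have the same first
   NDIGIT significant decimal digits, ignoring sign and exponent.  */
bool
same_digits (double x, double y, int ndigit)
{
  char xbuf[32], ybuf[32];
  int decpt, neg;

  assert (ndigit > 0 && ndigit <= 17);

  /* Each call writes into its own buffer, so neither result is lost.  */
  if (ecvt_r (x, ndigit, &decpt, &neg, xbuf, sizeof xbuf) != 0)
    return false;
  if (ecvt_r (y, ndigit, &decpt, &neg, ybuf, sizeof ybuf) != 0)
    return false;

  return strcmp (xbuf, ybuf) == 0;
}
```

For example, @code{12.3} and @code{1230.0} both yield the digits
@code{"123"} when @var{ndigit} is @code{3}, so @code{same_digits} reports
them as equal.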
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment GNU
 | |
| @deftypefun int fcvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
 | |
| The @code{fcvt_r} function is the same as @code{fcvt}, except
 | |
| that it places its result into the user-specified buffer pointed to by
 | |
| @var{buf}, with length @var{len}.
 | |
| 
 | |
| This function is a GNU extension.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment GNU
 | |
| @deftypefun int qecvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
 | |
| The @code{qecvt_r} function is the same as @code{qecvt}, except
 | |
| that it places its result into the user-specified buffer pointed to by
 | |
| @var{buf}, with length @var{len}.
 | |
| 
 | |
| This function is a GNU extension.
 | |
| @end deftypefun
 | |
| 
 | |
| @comment stdlib.h
 | |
| @comment GNU
 | |
| @deftypefun int qfcvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len})
 | |
| The @code{qfcvt_r} function is the same as @code{qfcvt}, except
 | |
| that it places its result into the user-specified buffer pointed to by
 | |
| @var{buf}, with length @var{len}.
 | |
| 
 | |
| This function is a GNU extension.
 | |
| @end deftypefun
 |