GNU Scientific Library -- Reference
Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2002 The GSL Team.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
The Texinfo source for this manual may be obtained from
ftp.gnu.org in the directory /gnu/gsl/.
Los Alamos National Laboratory
Department of Computer Science, Georgia Institute of Technology
Astrophysics and Radiation Measurements Group, Los Alamos National Laboratory
Network Theory Limited
Theoretical Astrophysics Group, Los Alamos National Laboratory
Department of Physics and Astronomy, The Johns Hopkins University
University of Paris-Dauphine
The GNU Scientific Library (GSL) is a collection of routines for numerical computing. The routines have been written from scratch in C, and present a modern Applications Programming Interface (API) for C programmers, allowing wrappers to be written for very high level languages. The source code is distributed under the GNU General Public License.
The library covers a wide range of topics in numerical computing. Routines are available for the following areas,
| Complex Numbers | Roots of Polynomials |
| Special Functions | Vectors and Matrices |
| Permutations | Combinations |
| Sorting | BLAS Support |
| Linear Algebra | CBLAS Library |
| Fast Fourier Transforms | Eigensystems |
| Random Numbers | Quadrature |
| Random Distributions | Quasi-Random Sequences |
| Histograms | Statistics |
| Monte Carlo Integration | N-Tuples |
| Differential Equations | Simulated Annealing |
| Numerical Differentiation | Interpolation |
| Series Acceleration | Chebyshev Approximations |
| Root-Finding | Discrete Hankel Transforms |
| Least-Squares Fitting | Minimization |
| IEEE Floating-Point | Physical Constants |
The use of these routines is described in this manual. Each chapter provides detailed definitions of the functions, followed by example programs and references to the articles on which the algorithms are based.
The subroutines in the GNU Scientific Library are "free software"; this means that everyone is free to use them, and to redistribute them in other free programs. The library is not in the public domain; it is copyrighted and there are conditions on its distribution. These conditions are designed to permit everything that a good cooperating citizen would want to do. What is not allowed is to try to prevent others from further sharing any version of the software that they might get from you.
Specifically, we want to make sure that you have the right to give away copies of any programs related to the GNU Scientific Library, that you receive their source code or else can get it if you want it, that you can change these programs or use pieces of them in new free programs, and that you know you can do these things.
To make sure that everyone has such rights, we have to forbid you to deprive anyone else of these rights. For example, if you distribute copies of any related code which uses the GNU Scientific Library, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights. This means that the library should not be redistributed in proprietary programs.
Also, for our own protection, we must make certain that everyone finds out that there is no warranty for the GNU Scientific Library. If these programs are modified by someone else and passed on, we want their recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation.
The precise conditions for the distribution of software related to the GNU Scientific Library are found in the GNU General Public License (see section GNU General Public License). Further information about this license is available from the GNU Project webpage Frequently Asked Questions about the GNU GPL,
The source code for the library can be obtained in different ways, by copying it from a friend, purchasing it on CDROM or downloading it from the internet. A list of public ftp servers which carry the source code can be found on the GNU website,
The preferred platform for the library is a GNU system, which allows it to take advantage of additional features in the GNU C compiler and GNU C library. However, the library is fully portable and compiles on most Unix platforms. It is also available for Microsoft Windows. Precompiled versions of the library can be purchased from commercial redistributors listed on the website.
Announcements of new releases, updates and other relevant events are
made on the gsl-announce mailing list. To subscribe to this
low-volume list, send an email of the following form,
To: gsl-announce-request@sources.redhat.com
Subject: subscribe
You will receive a response asking you to reply in order to confirm your subscription.
The following short program demonstrates the use of the library by computing the value of the Bessel function J_0(x) for x=5,
#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>

int
main (void)
{
  double x = 5.0;
  double y = gsl_sf_bessel_J0 (x);
  printf("J0(%g) = %.18e\n", x, y);
  return 0;
}
The output is shown below, and should be correct to double-precision accuracy,
J0(5) = -1.775967713143382920e-01
The steps needed to compile programs which use the library are described in the next chapter.
The software described in this manual has no warranty; it is provided "as is". It is your responsibility to validate the behavior of the routines and their accuracy using the source code provided. Consult the GNU General Public License for further details (see section GNU General Public License).
Additional information, including online copies of this manual, links to related projects, and mailing list archives, is available from the development website mentioned above. The developers of the library can be reached via the project's public mailing list,
gsl-discuss@sources.redhat.com
This mailing list can be used to ask questions not covered by this manual.
This chapter describes how to compile programs that use GSL, and introduces its conventions.
The library is written in ANSI C and is intended to conform to the ANSI C standard. It should be portable to any system with a working ANSI C compiler.
The library does not rely on any non-ANSI extensions in the interface it exports to the user. Programs you write using GSL can be ANSI compliant. Extensions which can be used in a way compatible with pure ANSI C are supported, however, via conditional compilation. This allows the library to take advantage of compiler extensions on those platforms which support them.
When an ANSI C feature is known to be broken on a particular system the library will exclude any related functions at compile-time. This should make it impossible to link a program that would use these functions and give incorrect results.
To avoid namespace conflicts all exported function names and variables
have the prefix gsl_, while exported macros have the prefix
GSL_.
The library header files are installed in their own `gsl' directory. You should write any preprocessor include statements with a `gsl/' directory prefix thus,
#include <gsl/gsl_math.h>
If the directory is not installed on the standard search path of your
compiler you will also need to provide its location to the preprocessor
as a command line flag. The default location of the `gsl'
directory is `/usr/local/include/gsl'. A typical compilation
command for a source file `app.c' with the GNU C compiler
gcc is,
gcc -I/usr/local/include -c app.c
This results in an object file `app.o'. The default
include path for gcc searches `/usr/local/include' automatically so
the -I option can be omitted when GSL is installed in its default
location.
The library is installed as a single file, `libgsl.a'. A shared version of the library is also installed on systems that support shared libraries. The default location of these files is `/usr/local/lib'. To link against the library you need to specify both the main library and a supporting CBLAS library, which provides standard basic linear algebra subroutines. A suitable CBLAS implementation is provided in the library `libgslcblas.a' if your system does not provide one. The following example shows how to link an application with the library,
gcc app.o -lgsl -lgslcblas -lm
The following command line shows how you would link the same application with an alternative blas library called `libcblas',
gcc app.o -lgsl -lcblas -lm
For the best performance an optimized platform-specific CBLAS
library should be used for -lcblas. The library must conform to
the CBLAS standard. The ATLAS package provides a portable
high-performance BLAS library with a CBLAS interface. It is
free software and should be installed for any work requiring fast vector
and matrix operations. The following command line will link with the
ATLAS library and its CBLAS interface,
gcc app.o -lgsl -lcblas -latlas -lm
For more information see section BLAS Support.
The program gsl-config provides information on the local version
of the library. For example, the following command shows that the
library has been installed under the directory `/usr/local',
bash$ gsl-config --prefix
/usr/local
Further information is available using the command gsl-config --help.
To run a program linked with the shared version of the library it may be
necessary to define the shell variable LD_LIBRARY_PATH to include
the directory where the library is installed. For example,
LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH ./app
To compile a statically linked version of the program instead, use the
-static flag in gcc,
gcc -static app.o -lgsl -lgslcblas -lm
For applications using autoconf the standard macro
AC_CHECK_LIB can be used to link with the library automatically
from a configure script. The library itself depends on the
presence of a CBLAS and math library as well, so these must also be
located before linking with the main libgsl file. The following
commands should be placed in the `configure.in' file to perform
these tests,
AC_CHECK_LIB(m,main)
AC_CHECK_LIB(gslcblas,main)
AC_CHECK_LIB(gsl,main)
Assuming the libraries are found the output during the configure stage looks like this,
checking for main in -lm... yes
checking for main in -lgslcblas... yes
checking for main in -lgsl... yes
If the library is found then the tests will define the macros
HAVE_LIBGSL, HAVE_LIBGSLCBLAS, HAVE_LIBM and add
the options -lgsl -lgslcblas -lm to the variable LIBS.
The tests above will find any version of the library. They are suitable for general use, where the versions of the functions are not important. An alternative macro is available in the file `gsl.m4' to test for a specific version of the library. To use this macro simply add the following line to your `configure.in' file instead of the tests above:
AM_PATH_GSL(GSL_VERSION,
[action-if-found],
[action-if-not-found])
The argument GSL_VERSION should be the two or three digit
MAJOR.MINOR or MAJOR.MINOR.MICRO version number of the release
you require. A suitable choice for action-if-not-found is,
AC_MSG_ERROR(could not find required version of GSL)
Then you can add the variables GSL_LIBS and GSL_CFLAGS to
your Makefile.am files to obtain the correct compiler flags.
GSL_LIBS is equal to the output of the gsl-config --libs
command and GSL_CFLAGS is equal to the output of the gsl-config --cflags
command. For example,
libgsdv_la_LDFLAGS = \
$(GTK_LIBDIR) \
$(GTK_LIBS) -lgsdvgsl $(GSL_LIBS) -lgslcblas
Note that the macro AM_PATH_GSL needs to use the C compiler so it
should appear in the `configure.in' file before the macro
AC_LANG_CPLUSPLUS for programs that use C++.
The inline keyword is not part of ANSI C and the library does not
export any inline function definitions by default. However, the library
provides optional inline versions of performance-critical functions by
conditional compilation. The inline versions of these functions can be
included by defining the macro HAVE_INLINE when compiling an
application.
gcc -c -DHAVE_INLINE app.c
If you use autoconf this macro can be defined automatically.
The following test should be placed in your `configure.in' file,
AC_C_INLINE

if test "$ac_cv_c_inline" != no ; then
  AC_DEFINE(HAVE_INLINE,1)
  AC_SUBST(HAVE_INLINE)
fi
and the macro will then be defined in the compilation flags or by
including the file `config.h' before any library headers. If you
do not define the macro HAVE_INLINE then the slower non-inlined
versions of the functions will be used instead.
Note that the actual usage of the inline keyword is extern
inline, which eliminates unnecessary function definitions in GCC.
If the form extern inline causes problems with other compilers a
stricter autoconf test can be used, see section Autoconf Macros.
The extended numerical type long double is part of the ANSI C
standard and should be available in every modern compiler. However, the
precision of long double is platform dependent, and this should
be considered when using it. The IEEE standard only specifies the
minimum precision of extended precision numbers, while the precision of
double is the same on all platforms.
In some system libraries the stdio.h formatted input/output
functions printf and scanf are not implemented correctly
for long double. Undefined or incorrect results are avoided by
testing these functions during the configure stage of library
compilation and eliminating certain GSL functions which depend on them
if necessary. The corresponding line in the configure output
looks like this,
checking whether printf works with long double... no
Consequently when long double formatted input/output does not
work on a given system it should be impossible to link a program which
uses GSL functions dependent on this.
If it is necessary to work on a system which does not support formatted
long double input/output then the options are to use binary
formats or to convert long double results into double for
reading and writing.
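The second option can be sketched in a few lines. The helper below is hypothetical (the name format_long_double_sketch is an assumption and not part of GSL); it narrows the value to double before formatting, so it relies only on the double conversions of the stdio functions, at the cost of any digits beyond double precision.

```c
#include <stdio.h>

/* Hypothetical helper, not a GSL function: format a long double by
   narrowing it to double first, so that only the double conversions
   of the formatted output functions are used.  Digits beyond double
   precision are lost.  Returns the value returned by snprintf. */
int
format_long_double_sketch (char *buf, size_t size, long double value)
{
  double narrowed = (double) value;
  return snprintf (buf, size, "%.17g", narrowed);
}
```

A caller would pass a buffer and its size, for example format_long_double_sketch (buf, sizeof buf, x), and then write buf with any ordinary string output routine.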
To help in writing portable applications GSL provides some
implementations of functions that are found in other libraries, such as
the BSD math library. You can write your application to use the native
versions of these functions, and substitute the GSL versions via a
preprocessor macro if they are unavailable on another platform. The
substitution can be made automatically if you use autoconf. For
example, to test whether the BSD function hypot is available you
can include the following line in the configure file `configure.in'
for your application,
AC_CHECK_FUNCS(hypot)
and place the following macro definitions in the file `config.h.in',
/* Substitute gsl_hypot for missing system hypot */

#ifndef HAVE_HYPOT
#define hypot gsl_hypot
#endif
The application source files can then use the include command
#include <config.h> to substitute gsl_hypot for each
occurrence of hypot when hypot is not available.
In most circumstances the best strategy is to use the native versions of these functions when available, and fall back to GSL versions otherwise, since this allows your application to take advantage of any platform-specific optimizations in the system library. This is the strategy used within GSL itself.
The main implementation of some functions in the library will not be optimal on all architectures. For example, there are several ways to compute a Gaussian random variate and their relative speeds are platform-dependent. In cases like this the library provides alternate implementations of these functions with the same interface. If you write your application using calls to the standard implementation you can select an alternative version later via a preprocessor definition. It is also possible to introduce your own optimized functions this way while retaining portability. The following lines demonstrate the use of a platform-dependent choice of methods for sampling from the Gaussian distribution,
#ifdef SPARC
#define gsl_ran_gaussian gsl_ran_gaussian_ratio_method
#endif
#ifdef INTEL
#define gsl_ran_gaussian my_gaussian
#endif
These lines would be placed in the configuration header file `config.h' of the application, which should then be included by all the source files. Note that the alternative implementations will not produce bit-for-bit identical results, and in the case of random number distributions will produce an entirely different stream of random variates.
Many functions in the library are defined for different numeric types.
This feature is implemented by varying the name of the function with a
type-related modifier -- a primitive form of C++ templates. The
modifier is inserted into the function name after the initial module
prefix. The following table shows the function names defined for all
the numeric types of an imaginary module gsl_foo with function
fn,
gsl_foo_fn               double
gsl_foo_long_double_fn   long double
gsl_foo_float_fn         float
gsl_foo_long_fn          long
gsl_foo_ulong_fn         unsigned long
gsl_foo_int_fn           int
gsl_foo_uint_fn          unsigned int
gsl_foo_short_fn         short
gsl_foo_ushort_fn        unsigned short
gsl_foo_char_fn          char
gsl_foo_uchar_fn         unsigned char
The normal numeric precision double is considered the default and
does not require a suffix. For example, the function
gsl_stats_mean computes the mean of double precision numbers,
while the function gsl_stats_int_mean computes the mean of
integers.
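The naming scheme can be imitated in ordinary C with a macro that stamps out one function per numeric type. The following is only a sketch of the idea, not the mechanism used inside the library; the macro DEFINE_MEAN_SKETCH and the prefix sketch_stats are hypothetical names chosen to mirror the gsl_stats_mean and gsl_stats_int_mean pattern.

```c
#include <stddef.h>

/* Hypothetical macro: defines a mean function called NAME that
   operates on an array of TYPE and accumulates in double. */
#define DEFINE_MEAN_SKETCH(NAME, TYPE)            \
  double                                          \
  NAME (const TYPE data[], size_t n)              \
  {                                               \
    double sum = 0.0;                             \
    size_t i;                                     \
    for (i = 0; i < n; i++)                       \
      sum += (double) data[i];                    \
    return (n > 0) ? sum / (double) n : 0.0;      \
  }

/* The default (double) version carries no type modifier ... */
DEFINE_MEAN_SKETCH (sketch_stats_mean, double)

/* ... while the int version inserts one after the module prefix. */
DEFINE_MEAN_SKETCH (sketch_stats_int_mean, int)
```

Each expansion is an independent, fully typed function, so the compiler still checks argument types at every call site.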
A corresponding scheme is used for library defined types, such as
gsl_vector and gsl_matrix. In this case the modifier is
appended to the type name. For example, if a module defines a new
type-dependent struct or typedef gsl_foo it is modified for other
types in the following way,
gsl_foo                  double
gsl_foo_long_double      long double
gsl_foo_float            float
gsl_foo_long             long
gsl_foo_ulong            unsigned long
gsl_foo_int              int
gsl_foo_uint             unsigned int
gsl_foo_short            short
gsl_foo_ushort           unsigned short
gsl_foo_char             char
gsl_foo_uchar            unsigned char
When a module contains type-dependent definitions the library provides individual header files for each type. The filenames are modified as shown below. For convenience the default header includes the definitions for all the types. To include only the double precision header file, or any other specific type, use its individual filename.
#include <gsl/gsl_foo.h>               All types
#include <gsl/gsl_foo_double.h>        double
#include <gsl/gsl_foo_long_double.h>   long double
#include <gsl/gsl_foo_float.h>         float
#include <gsl/gsl_foo_long.h>          long
#include <gsl/gsl_foo_ulong.h>         unsigned long
#include <gsl/gsl_foo_int.h>           int
#include <gsl/gsl_foo_uint.h>          unsigned int
#include <gsl/gsl_foo_short.h>         short
#include <gsl/gsl_foo_ushort.h>        unsigned short
#include <gsl/gsl_foo_char.h>          char
#include <gsl/gsl_foo_uchar.h>         unsigned char
The library header files automatically define functions to have
extern "C" linkage when included in C++ programs.
The library assumes that arrays, vectors and matrices passed as
modifiable arguments are not aliased and do not overlap with each other.
This removes the need for the library to handle overlapping memory
regions as a special case, and allows additional optimizations to be
used. If overlapping memory regions are passed as modifiable arguments
then the results of such functions will be undefined. If the arguments
will not be modified (for example, if a function prototype declares them
as const arguments) then overlapping or aliased memory regions
can be safely used.
The library can be used in multi-threaded programs. All the functions
are thread-safe, in the sense that they do not use static variables.
Memory is always associated with objects and not with functions. For
functions which use workspace objects as temporary storage the
workspaces should be allocated on a per-thread basis. For functions
which use table objects as read-only memory the tables can be used
by multiple threads simultaneously. Table arguments are always declared
const in function prototypes, to indicate that they may be
safely accessed by different threads.
There are a small number of static global variables which are used to control the overall behavior of the library (e.g. whether to use range-checking, the function to call on fatal error, etc). These variables are set directly by the user, so they should be initialized once at program startup and not modified by different threads.
Where possible the routines in the library have been written to avoid
dependencies between modules and files. This should make it possible to
extract individual functions for use in your own applications, without
needing to have the whole library installed. You may need to define
certain macros such as GSL_ERROR and remove some #include
statements in order to compile the files as standalone units. Reuse of
the library code in this way is encouraged, subject to the terms of the
GNU General Public License.
This chapter describes the way that GSL functions report and handle errors. By examining the status information returned by every function you can determine whether it succeeded or failed, and if it failed you can find out what the precise cause of failure was. You can also define your own error handling functions to modify the default behavior of the library.
The functions described in this section are declared in the header file `gsl_errno.h'.
The library follows the thread-safe error reporting conventions of the
POSIX Threads library. Functions return a non-zero error code to
indicate an error and 0 to indicate success.
int status = gsl_function (...);

if (status) { /* an error occurred */
  .....
  /* status value specifies the type of error */
}
The routines report an error whenever they cannot perform the task requested of them. For example, a root-finding function would return a non-zero error code if it could not converge to the requested accuracy, or exceeded a limit on the number of iterations. Situations like this are a normal occurrence when using any mathematical library and you should check the return status of the functions that you call.
Whenever a routine reports an error the return value specifies the type
of error. The return value is analogous to the value of the variable
errno in the C library. The caller can examine the return code
and decide what action to take, including ignoring the error if it is
not considered serious.
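To make the convention concrete, here is a small standalone sketch of a root-finding routine that reports failure through its return value. It does not use the library: the status codes SKETCH_SUCCESS, SKETCH_EINVAL and SKETCH_EMAXITER, the routine bisect_sketch and the test function sketch_f are all hypothetical names, and plain bisection is used only because it is short.

```c
/* Hypothetical status codes for this sketch; GSL defines its own
   values in gsl_errno.h.  Zero means success, as in the library. */
enum { SKETCH_SUCCESS = 0, SKETCH_EINVAL = 1, SKETCH_EMAXITER = 2 };

/* Example function with a root at sqrt(2). */
double
sketch_f (double x)
{
  return x * x - 2.0;
}

/* Bisection on [lo, hi].  Stores the latest midpoint in *root and
   returns a status code instead of printing or aborting. */
int
bisect_sketch (double (*f) (double), double lo, double hi,
               double tol, int max_iter, double *root)
{
  double flo = f (lo);
  double fhi = f (hi);
  int i;

  if (!(lo < hi) || (flo > 0.0) == (fhi > 0.0))
    return SKETCH_EINVAL;       /* interval does not bracket a root */

  for (i = 0; i < max_iter; i++)
    {
      double mid = 0.5 * (lo + hi);
      double fmid = f (mid);

      *root = mid;
      if (hi - lo < tol)
        return SKETCH_SUCCESS;  /* requested accuracy reached */

      if ((fmid > 0.0) == (flo > 0.0))
        {
          lo = mid;
          flo = fmid;
        }
      else
        hi = mid;
    }

  return SKETCH_EMAXITER;       /* iteration limit exceeded */
}
```

The caller tests the returned status first and only then trusts the value stored in root; a non-zero status tells it which of the two failure modes occurred.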
In addition to reporting errors by return codes the library also has an
error handler function gsl_error. This function is called by
other library functions when they report an error, just before they
return to the caller. The default behavior of the error handler is to
print a message and abort the program,
gsl: file.c:67: ERROR: invalid argument supplied by user
Default GSL error handler invoked.
Aborted
The purpose of the gsl_error handler is to provide a function
where a breakpoint can be set that will catch library errors when
running under the debugger. It is not intended for use in production
programs, which should handle any errors using the return codes.
The error code numbers returned by library functions are defined in the
file `gsl_errno.h'. They all have the prefix GSL_ and
expand to non-zero constant integer values. Many of the error codes use
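the same base name as a corresponding error code in the C library. Here are
some of the most common error codes,
GSL_EDOM: Domain error; used by mathematical functions when an argument value does not fall into the domain over which the function is defined (like EDOM in the C library).
GSL_ERANGE: Range error; used by mathematical functions when the result value is not representable because of overflow or underflow (like ERANGE in the C library).
GSL_ENOMEM: No memory available. This error is reported when a GSL routine encounters problems when trying to allocate memory with malloc.
GSL_EINVAL: Invalid argument. This is used by various library functions to indicate that they have been passed an invalid argument.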
The error codes can be converted into an error message using the
function gsl_strerror.
printf("error: %s\n", gsl_strerror (status));
would print an error message like error: output range error for a
status value of GSL_ERANGE.
The default behavior of the GSL error handler is to print a short
message and call abort(). When this default is in use programs
will stop with a core-dump whenever a library routine reports an error.
This is intended as a fail-safe default for programs which do not check
the return status of library routines (we don't encourage you to write
programs this way).
If you turn off the default error handler it is your responsibility to check the return values of routines and handle them yourself. You can also customize the error behavior by providing a new error handler. For example, an alternative error handler could log all errors to a file, ignore certain error conditions (such as underflows), or start the debugger and attach it to the current process when an error occurs.
All GSL error handlers have the type gsl_error_handler_t, which is
defined in `gsl_errno.h',
This is the type of GSL error handler functions. An error handler will
be passed four arguments which specify the reason for the error (a
string), the name of the source file in which it occurred (also a
string), the line number in that file (an integer) and the error number
(an integer). The source file and line number are set at compile time
using the __FILE__ and __LINE__ directives in the
preprocessor. An error handler function returns type void.
Error handler functions should be defined like this,
void handler (const char * reason,
              const char * file,
              int line,
              int gsl_errno)
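As an illustration of this signature, the sketch below defines a handler that logs each error to stderr and remembers the most recent error number instead of aborting. The names logging_handler_sketch and last_error_sketch are hypothetical; the function itself needs no library headers, and in a real program it would be passed to the library's handler-setting function.

```c
#include <stdio.h>

/* Most recent error number seen by the handler (0 if none). */
static int last_error_sketch = 0;

/* Hypothetical handler with the four-argument signature shown above:
   it logs the error and returns, leaving the caller to act on the
   status code returned by the library routine. */
void
logging_handler_sketch (const char *reason, const char *file,
                        int line, int gsl_errno)
{
  last_error_sketch = gsl_errno;
  fprintf (stderr, "gsl: %s:%d: ERROR %d: %s\n",
           file, line, gsl_errno, reason);
}
```

Because the handler returns normally, every library call made while it is installed must still have its return value checked.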
To request the use of your own error handler you need to call the
function gsl_set_error_handler which is also declared in
`gsl_errno.h',
This function sets a new error handler, new_handler, for the GSL library routines. The previous handler is returned (so that you can restore it later). Note that the pointer to a user defined error handler function is stored in a static variable, so there can be only one error handler per program. This function should not be used in multi-threaded programs except to set up a program-wide error handler from a master thread. The following example shows how to set and restore a new error handler,
/* save original handler, install new handler */
old_handler = gsl_set_error_handler (&my_handler);

/* code uses new handler */
.....

/* restore original handler */
gsl_set_error_handler (old_handler);
To use the default behavior (abort on error) set the error
handler to NULL,
old_handler = gsl_set_error_handler (NULL);
The error behavior can be changed for specific applications by
recompiling the library with a customized definition of the
GSL_ERROR macro in the file `gsl_errno.h'.
If you are writing numerical functions in a program which also uses GSL code you may find it convenient to adopt the same error reporting conventions as in the library.
To report an error you need to call the function gsl_error with a
string describing the error and then return an appropriate error code
from gsl_errno.h, or a special value, such as NaN. For
convenience the file `gsl_errno.h' defines two macros which carry
out these steps:
The macro GSL_ERROR (reason, gsl_errno) reports an error using the GSL conventions and returns a
status value of gsl_errno. It expands to the following code fragment,
gsl_error (reason, __FILE__, __LINE__, gsl_errno);
return gsl_errno;
The macro definition in `gsl_errno.h' actually wraps the code
in a do { ... } while (0) block to prevent possible
parsing problems.
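The parsing problem is easiest to see with an unbraced if/else. The standalone sketch below uses a hypothetical macro SKETCH_ERROR and reporting function sketch_report (neither is part of the library) to show the same wrapping. If the macro body were a bare { ... } block, the semicolon written after the macro call would end the if statement and the following else would be a syntax error.

```c
#include <stdio.h>

/* Hypothetical codes; the numeric values are arbitrary. */
enum { SKETCH_SUCCESS = 0, SKETCH_ETOL = 1 };

static int sketch_error_count = 0;

/* Stand-in for an error handler: counts and logs the error. */
static void
sketch_report (const char *reason, const char *file, int line, int code)
{
  sketch_error_count++;
  fprintf (stderr, "%s:%d: ERROR: %s (code %d)\n", file, line, reason, code);
}

/* Two statements wrapped so that the macro behaves as a single
   statement and can be followed by a semicolon anywhere. */
#define SKETCH_ERROR(reason, code)                       \
  do {                                                   \
    sketch_report (reason, __FILE__, __LINE__, code);    \
    return code;                                         \
  } while (0)

int
check_residual_sketch (double residual, double tolerance)
{
  if (residual > tolerance)
    SKETCH_ERROR ("residual exceeds tolerance", SKETCH_ETOL);
  else
    return SKETCH_SUCCESS;
}
```

With the do { ... } while (0) wrapper the call plus its semicolon forms exactly one statement, so the else attaches to the if as intended.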
Here is an example of how the macro could be used to report that a
routine did not achieve a requested tolerance. To report the error the
routine needs to return the error code GSL_ETOL.
if (residual > tolerance)
  {
    GSL_ERROR("residual exceeds tolerance", GSL_ETOL);
  }
The macro GSL_ERROR_VAL (reason, gsl_errno, value) is the same as GSL_ERROR but returns a user-defined
status value of value instead of an error code. It can be used for
mathematical functions that return a floating point value.
The following example shows how to return a NaN at a mathematical
singularity using the GSL_ERROR_VAL macro,
if (x == 0)
  {
    GSL_ERROR_VAL("argument lies on singularity",
                  GSL_ERANGE, GSL_NAN);
  }
Here is an example of some code which checks the return value of a function where an error might be reported,
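#include <stdio.h>
#include <stdlib.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_fft_complex.h>

int
main (void)
{
  /* n complex values stored as interleaved (real, imaginary) pairs;
     n = 6 is deliberately not a power of two */
  double data[2 * 6] = { 0.0 };
  int n = 6;
  int status;

  gsl_set_error_handler_off();

  /* the second argument is the stride between complex elements */
  status = gsl_fft_complex_radix2_forward (data, 1, n);

  if (status) {
    if (status == GSL_EINVAL) {
      fprintf (stderr, "invalid argument, n=%d\n", n);
    } else {
      fprintf (stderr, "failed, gsl_errno=%d\n",
               status);
    }
    exit (-1);
  }
  exit (0);
}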
The function gsl_fft_complex_radix2_forward only accepts integer lengths
which are a power of two. If the variable n is not a power of
two then the call to the library function will return GSL_EINVAL,
indicating that the length argument is invalid. The function call to
gsl_set_error_handler_off() stops the default error handler from
aborting the program. The else clause catches any other possible
errors.
This chapter describes basic mathematical functions. Some of these functions are present in system libraries, but the alternative versions given here can be used as a substitute when the system functions are not available.
The functions and macros described in this chapter are defined in the header file `gsl_math.h'.
The library ensures that the standard BSD mathematical constants are defined. For reference here is a list of the constants.
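M_E          The base of exponentials, e
M_LOG2E      The base-2 logarithm of e, log_2 (e)
M_LOG10E     The base-10 logarithm of e, log_10 (e)
M_SQRT2      The square root of two, sqrt(2)
M_SQRT1_2    The square root of one-half, sqrt(1/2)
M_SQRT3      The square root of three, sqrt(3)
M_PI         The constant pi
M_PI_2       Pi divided by two, pi/2
M_PI_4       Pi divided by four, pi/4
M_SQRTPI     The square root of pi, sqrt(pi)
M_2_SQRTPI   Two divided by the square root of pi, 2/sqrt(pi)
M_1_PI       The reciprocal of pi, 1/pi
M_2_PI       Twice the reciprocal of pi, 2/pi
M_LN10       The natural logarithm of ten, ln(10)
M_LN2        The natural logarithm of two, ln(2)
M_LNPI       The natural logarithm of pi, ln(pi)
M_EULER      Euler's constant, gamma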
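GSL_POSINF: the IEEE representation of positive infinity. It is computed from the expression +1.0/0.0.
GSL_NEGINF: the IEEE representation of negative infinity. It is computed from the expression -1.0/0.0.
GSL_NAN: the IEEE representation of the Not-a-Number symbol, NaN. It is computed from the ratio 0.0/0.0.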
The following routines provide portable implementations of functions
found in the BSD math library. When native versions are not available
the functions described here can be used instead. The substitution can
be made automatically if you use autoconf to compile your
application (see section Portability functions).
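gsl_log1p (x): computes log(1+x) in a way that is accurate for small x. It provides an alternative to the BSD math function log1p(x).
gsl_expm1 (x): computes exp(x)-1 in a way that is accurate for small x. It provides an alternative to the BSD math function expm1(x).
gsl_hypot (x, y): computes sqrt(x^2 + y^2) in a way that avoids overflow. It provides an alternative to the BSD math function hypot(x,y).
gsl_acosh (x): computes arccosh(x). It provides an alternative to the standard math function acosh(x).
gsl_asinh (x): computes arcsinh(x). It provides an alternative to the standard math function asinh(x).
gsl_atanh (x): computes arctanh(x). It provides an alternative to the standard math function atanh(x).
gsl_ldexp (x, e): computes x * 2^e. It provides an alternative to the standard math function ldexp.
gsl_frexp (x, e): splits x into a normalized fraction f and an exponent e such that x = f * 2^e with 0.5 <= f < 1; it returns f and stores the exponent in e. It provides an alternative to the standard math function frexp(x, e).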
A common complaint about the standard C library is its lack of a function for calculating (small) integer powers. GSL provides some simple functions to fill this gap. For reasons of efficiency, these functions do not check for overflow or underflow conditions.
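gsl_pow_int (x, n): computes the power x^n for integer n. A version of this function which also computes the numerical error in the result is available as gsl_sf_pow_int_e.
gsl_pow_2 (x) through gsl_pow_9 (x): compute the small integer powers x^2 through x^9 explicitly, for example,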
#include <gsl/gsl_math.h>
double y = gsl_pow_4 (3.141);  /* compute 3.141**4 */
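For a general integer exponent, one common technique is repeated squaring, which needs only about log2(n) multiplications. The function below is a standalone sketch under that assumption; the name pow_int_sketch is hypothetical and the code is not claimed to be the library's implementation. Like the routines described above, it performs no overflow or underflow checks.

```c
/* Hypothetical sketch: compute x^n for integer n by repeated
   squaring.  Negative exponents are handled by inverting x first.
   No overflow or underflow checks are made. */
double
pow_int_sketch (double x, int n)
{
  double value = 1.0;
  unsigned int u;

  if (n < 0)
    {
      x = 1.0 / x;
      u = 0u - (unsigned int) n;   /* magnitude of n, safe for INT_MIN */
    }
  else
    {
      u = (unsigned int) n;
    }

  /* Multiply in x^(2^k) for every set bit k of the exponent. */
  while (u != 0)
    {
      if (u & 1u)
        value *= x;
      x *= x;
      u >>= 1;
    }

  return value;
}
```

For example, pow_int_sketch (2.0, 10) multiplies in the factors 2^2 and 2^8, corresponding to the two set bits of the exponent.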
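GSL_SIGN(x): returns the sign of x. It is defined as ((x) >= 0
? 1 : -1). Note that with this definition the sign of zero is positive
(regardless of its IEEE sign bit).
GSL_IS_ODD(n): evaluates to 1 if n is odd and 0 if n is even. The argument n must be of
integer type.
GSL_IS_EVEN(n): the opposite of GSL_IS_ODD(n). It evaluates to 1 if
n is even and 0 if n is odd. The argument n must be of
integer type.
GSL_MAX(a,b): returns the maximum of a and b. It is defined as ((a) > (b) ? (a):(b)).
GSL_MIN(a,b): returns the minimum of a and b. It is defined as ((a) < (b) ? (a):(b)).
GSL_MAX_DBL (a, b): returns the maximum of the double precision numbers a and b using an inline function. On platforms where inline functions are not available the macro
GSL_MAX will be automatically substituted.
GSL_MIN_DBL (a, b): returns the minimum of the double precision numbers a and b using an inline function. On platforms where inline functions are not available the macro
GSL_MIN will be automatically substituted.
GSL_MAX_INT (a, b), GSL_MIN_INT (a, b): return the maximum or minimum of the integers a and b using an inline function. On platforms where inline functions are not available the macros GSL_MAX or GSL_MIN
will be automatically substituted.
GSL_MAX_LDBL (a, b), GSL_MIN_LDBL (a, b): return the maximum or minimum of the long double numbers a and b using an inline function. On platforms where inline functions are not available the macros GSL_MAX or GSL_MIN
will be automatically substituted.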
It is sometimes useful to be able to compare two floating point numbers approximately, to allow for rounding and truncation errors. The following function implements the approximate floating-point comparison algorithm proposed by D.E. Knuth in Section 4.2.2 of Seminumerical Algorithms (3rd edition).
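The function gsl_fcmp (x, y, epsilon) determines whether x and y are
approximately equal to a relative accuracy epsilon.
The relative accuracy is measured using an interval of size 2
\delta, where \delta = 2^k \epsilon and k is the
maximum base-2 exponent of x and y as computed by the
function frexp().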
If x and y lie within this interval, they are considered approximately equal and the function returns 0. Otherwise if x < y, the function returns -1, or if x > y, the function returns +1.
The implementation is based on the package fcmp by T.C. Belding.
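The comparison rule described above can be written out directly with the standard functions frexp and ldexp. The following is a standalone sketch of that rule, with the hypothetical name approx_cmp_sketch; it is meant to illustrate the interval test and is not the library's source.

```c
#include <math.h>

/* Hypothetical sketch of the approximate comparison: returns 0 if
   x and y agree to within delta = 2^k * epsilon, where k is the
   base-2 exponent (from frexp) of the operand of larger magnitude;
   otherwise returns -1 if x < y and +1 if x > y. */
int
approx_cmp_sketch (double x, double y, double epsilon)
{
  int exponent;
  double delta, difference;
  double larger = (fabs (x) > fabs (y)) ? x : y;

  /* frexp stores k such that larger = f * 2^k with 0.5 <= |f| < 1 */
  frexp (larger, &exponent);

  /* delta = epsilon * 2^k */
  delta = ldexp (epsilon, exponent);

  difference = x - y;

  if (difference > delta)
    return 1;
  else if (difference < -delta)
    return -1;
  else
    return 0;
}
```

Scaling epsilon by the exponent of the larger operand makes the tolerance relative, so the same epsilon works for values near 1 and for values near 1e10.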
The functions described in this chapter provide support for complex numbers. The algorithms take care to avoid unnecessary intermediate underflows and overflows, allowing the functions to be evaluated over as much of the complex plane as possible.
For multiple-valued functions the branch cuts have been chosen to follow the conventions of Abramowitz and Stegun in the Handbook of Mathematical Functions. The functions return principal values which are the same as those in GNU Calc, which in turn are the same as those in Common Lisp, The Language (Second Edition) (n.b. The second edition uses different definitions from the first edition) and the HP-28/48 series of calculators.
The complex types are defined in the header file `gsl_complex.h', while the corresponding complex functions and arithmetic operations are defined in `gsl_complex_math.h'.
Complex numbers are represented using the type gsl_complex. The
internal representation of this type may vary across platforms and
should not be accessed directly. The functions and macros described
below allow complex numbers to be manipulated in a portable way.
For reference, the default form of the gsl_complex type is
given by the following struct,
typedef struct
{
double dat[2];
} gsl_complex;
The real and imaginary parts are stored in contiguous elements of a two
element array. This eliminates any padding between the real and
imaginary parts, dat[0] and dat[1], allowing the struct to
be mapped correctly onto packed complex arrays.
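To make the remark about packed arrays concrete, the sketch below copies an array of interleaved (real, imaginary) pairs into an array of structs with the same layout as the default form shown above. The type complex_sketch and the function unpack_complex are hypothetical names used only for this illustration. The size check guards against a platform that adds padding to the struct.

```c
#include <string.h>

/* Same layout as the default struct shown above (illustration only). */
typedef struct
{
  double dat[2];
} complex_sketch;

/* Copy n packed (re, im) pairs into an array of structs.
   Returns 0 on success, -1 if the struct is not exactly two doubles. */
int
unpack_complex (const double *packed, complex_sketch *z, size_t n)
{
  if (sizeof (complex_sketch) != 2 * sizeof (double))
    return -1;                    /* unexpected padding */

  memcpy (z, packed, n * sizeof (complex_sketch));
  return 0;
}
```

After unpacking the array {3, 4, 5, 6}, the first element holds 3 + 4i and the second holds 5 + 6i.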
GSL_SET_COMPLEX(&z, 3, 4)
sets z to be 3 + 4i.
log(gsl_complex_abs(z)) would lead to a loss of
precision in this case.
The implementations of the elementary and trigonometric functions are based on the following papers,
The general formulas and details of branch cuts can be found in the following books,
This chapter describes functions for evaluating and solving polynomials.
There are routines for finding real and complex roots of quadratic and
cubic equations using analytic methods. An iterative polynomial solver
is also available for finding the roots of general polynomials with real
coefficients (of any order). The functions are declared in the header
file gsl_poly.h.
The functions described here manipulate polynomials stored in Newton's divided-difference representation. The use of divided-differences is described in Abramowitz & Stegun sections 25.1.4, 25.2.26.
The number of real roots (either zero or two) is returned, and their locations are stored in x0 and x1. If no real roots are found then x0 and x1 are not modified. When two real roots are found they are stored in x0 and x1 in ascending order. The case of coincident roots is not considered special. For example (x-1)^2=0 will have two roots, which happen to have exactly equal values.
The number of roots found depends on the sign of the discriminant b^2 - 4 a c. This will be subject to rounding and cancellation errors when computed in double precision, and will also be subject to errors if the coefficients of the polynomial are inexact. These errors may cause a discrete change in the number of roots. However, for polynomials with small integer coefficients the discriminant can always be computed exactly.
This function finds the complex roots of the quadratic equation,
The number of complex roots is returned (always two) and the locations of the roots are stored in z0 and z1. The roots are returned in ascending order, sorted first by their real components and then by their imaginary components.
This function finds the real roots of the cubic equation,
with a leading coefficient of unity. The number of real roots (either one or three) is returned, and their locations are stored in x0, x1 and x2. If one real root is found then only x0 is modified. When three real roots are found they are stored in x0, x1 and x2 in ascending order. The case of coincident roots is not considered special. For example, the equation (x-1)^3=0 will have three roots with exactly equal values.
This function finds the complex roots of the cubic equation,
The number of complex roots is returned (always three) and the locations of the roots are stored in z0, z1 and z2. The roots are returned in ascending order, sorted first by their real components and then by their imaginary components.
The roots of polynomial equations cannot be found analytically beyond the special cases of the quadratic, cubic and quartic equation. The algorithm described in this section uses an iterative method to find the approximate locations of roots of higher order polynomials.
gsl_poly_complex_workspace
struct and a workspace suitable for solving a polynomial with n
coefficients using the routine gsl_poly_complex_solve.
The function returns a pointer to the newly allocated
gsl_poly_complex_workspace if no errors were detected, and a null
pointer in the case of error.
The function returns GSL_SUCCESS if all the roots are found and
GSL_EFAILED if the QR reduction does not converge.
To demonstrate the use of the general polynomial solver we will take the polynomial P(x) = x^5 - 1 which has the following roots,
The following program will find these roots.
#include <stdio.h>
#include <gsl/gsl_poly.h>
int
main (void)
{
int i;
/* coefficient of P(x) = -1 + x^5 */
double a[6] = { -1, 0, 0, 0, 0, 1 };
double z[10];
gsl_poly_complex_workspace * w
= gsl_poly_complex_workspace_alloc (6);
gsl_poly_complex_solve (a, 6, w, z);
gsl_poly_complex_workspace_free (w);
for (i = 0; i < 5; i++)
{
printf("z%d = %+.18f %+.18f\n",
i, z[2*i], z[2*i+1]);
}
return 0;
}
The output of the program is,
bash$ ./a.out
z0 = -0.809016994374947451 +0.587785252292473137
z1 = -0.809016994374947451 -0.587785252292473137
z2 = +0.309016994374947451 +0.951056516295153642
z3 = +0.309016994374947451 -0.951056516295153642
z4 = +1.000000000000000000 +0.000000000000000000
which agrees with the analytic result, z_n = \exp(2 \pi n i/5).
The balanced-QR method and its error analysis are described in the following papers.
This chapter describes the GSL special function library. The library includes routines for calculating the values of Airy functions, Bessel functions, Clausen functions, Coulomb wave functions, Coupling coefficients, the Dawson function, Debye functions, Dilogarithms, Elliptic integrals, Jacobi elliptic functions, Error functions, Exponential integrals, Fermi-Dirac functions, Gamma functions, Gegenbauer functions, Hypergeometric functions, Laguerre functions, Legendre functions and Spherical Harmonics, the Psi (Digamma) Function, Synchrotron functions, Transport functions, Trigonometric functions and Zeta functions. Each routine also computes an estimate of the numerical error in the calculated value of the function.
The functions are declared in individual header files, such as `gsl_sf_airy.h', `gsl_sf_bessel.h', etc. The complete set of header files can be included using the file `gsl_sf.h'.
The special functions are available in two calling conventions, a natural form which returns the numerical value of the function and an error-handling form which returns an error code. The two types of function provide alternative ways of accessing the same underlying code.
The natural form returns only the value of the function and can be used directly in mathematical expressions. For example, the following function call will compute the value of the Bessel function J_0(x),
double y = gsl_sf_bessel_J0 (x);
There is no way to access an error code or to estimate the error using this method. To allow access to this information the alternative error-handling form stores the value and error in a modifiable argument,
gsl_sf_result result;
int status = gsl_sf_bessel_J0_e (x, &result);
The error-handling functions have the suffix _e. The returned
status value indicates error conditions such as overflow, underflow or
loss of precision. If there are no errors the error-handling functions
return GSL_SUCCESS.
The error handling form of the special functions always calculates an error estimate along with the value of the result. Therefore, structures are provided for amalgamating a value and error estimate. These structures are declared in the header file `gsl_sf_result.h'.
The gsl_sf_result struct contains value and error fields.
typedef struct
{
double val;
double err;
} gsl_sf_result;
The field val contains the value and the field err contains an estimate of the absolute error in the value.
In some cases, an overflow or underflow can be detected and handled by a
function. In this case, it may be possible to return a scaling exponent
as well as an error/value pair in order to save the result from
exceeding the dynamic range of the built-in types. The
gsl_sf_result_e10 struct contains value and error fields as well
as an exponent field such that the actual result is obtained as
result * 10^(e10).
typedef struct
{
double val;
double err;
int e10;
} gsl_sf_result_e10;
The goal of the library is to achieve double precision accuracy wherever
possible. However the cost of evaluating some special functions to
double precision can be significant, particularly where very high order
terms are required. In these cases a mode argument allows the
accuracy of the function to be reduced in order to improve performance.
The following precision levels are available for the mode argument,
GSL_PREC_DOUBLE
GSL_PREC_SINGLE
GSL_PREC_APPROX
The approximate mode provides the fastest evaluation at the lowest accuracy.
The Airy functions Ai(x) and Bi(x) are defined by the integral representations,
For further information see Abramowitz & Stegun, Section 10.4. The Airy functions are defined in the header file `gsl_sf_airy.h'.
The routines described in this section compute the Cylindrical Bessel functions J_n(x), Y_n(x), Modified cylindrical Bessel functions I_n(x), K_n(x), Spherical Bessel functions j_l(x), y_l(x), and Modified Spherical Bessel functions i_l(x), k_l(x). For more information see Abramowitz & Stegun, Chapters 9 and 10. The Bessel functions are defined in the header file `gsl_sf_bessel.h'.
The regular modified spherical Bessel functions i_l(x) are related to the modified Bessel functions of fractional order, i_l(x) = \sqrt{\pi/(2x)} I_{l+1/2}(x).
The irregular modified spherical Bessel functions k_l(x) are related to the irregular modified Bessel functions of fractional order, k_l(x) = \sqrt{\pi/(2x)} K_{l+1/2}(x).
The Clausen function is defined by the following integral, It is related to the dilogarithm by Cl_2(\theta) = \Im Li_2(\exp(i \theta)). The Clausen functions are declared in the header file `gsl_sf_clausen.h'.
The Coulomb functions are declared in the header file `gsl_sf_coulomb.h'. Both bound state and scattering solutions are available.
The Coulomb wave functions F_L(\eta,x), G_L(\eta,x) are
described in Abramowitz & Stegun, Chapter 14. Because there can be a
large dynamic range of values for these functions, overflows are handled
gracefully. If an overflow occurs, GSL_EOVRFLW is signalled and
exponent(s) are returned through the modifiable parameters exp_F,
exp_G. The full solution can be reconstructed from the following
relations,
GSL_EOVRFLW is returned and scaling exponents are stored in
the modifiable parameters exp_F, exp_G.
The Coulomb wave function normalization constant is defined in Abramowitz & Stegun, 14.1.7.
The Wigner 3-j, 6-j and 9-j symbols give the coupling coefficients for combined angular momentum vectors. Since the arguments of the standard coupling coefficient functions are integer or half-integer, the arguments of the following functions are, by convention, integers equal to twice the actual spin value. For information on the 3-j coefficients see Abramowitz & Stegun, Section 27.9. The functions described in this section are declared in the header file `gsl_sf_coupling.h'.
where the arguments are given in half-integer units, ja = two_ja/2, ma = two_ma/2, etc.
where the arguments are given in half-integer units, ja = two_ja/2, ma = two_ma/2, etc.
where the arguments are given in half-integer units, ja = two_ja/2, ma = two_ma/2, etc.
The Dawson integral is defined by \exp(-x^2) \int_0^x dt \exp(t^2). A table of Dawson's integral can be found in Abramowitz & Stegun, Table 7.5. The Dawson functions are declared in the header file `gsl_sf_dawson.h'.
The Debye functions are defined by the integral D_n(x) = n/x^n \int_0^x dt (t^n/(e^t - 1)). For further information see Abramowitz & Stegun, Section 27.1. The Debye functions are declared in the header file `gsl_sf_debye.h'.
The functions described in this section are declared in the header file `gsl_sf_dilog.h'.
The following functions allow for the propagation of errors when combining quantities by multiplication. The functions are declared in the header file `gsl_sf_elementary.h'.
The functions described in this section are declared in the header file `gsl_sf_ellint.h'.
The Legendre forms of elliptic integrals F(\phi,k), E(\phi,k) and P(\phi,k,n) are defined by,
The complete Legendre forms are denoted by K(k) = F(\pi/2, k) and E(k) = E(\pi/2, k). Further information on the Legendre forms of elliptic integrals can be found in Abramowitz & Stegun, Chapter 17. The notation used here is based on Carlson, Numerische Mathematik 33 (1979) 1 and differs slightly from that used by Abramowitz & Stegun.
The Carlson symmetric forms of elliptical integrals RC(x,y), RD(x,y,z), RF(x,y,z) and RJ(x,y,z,p) are defined by,
The Jacobian Elliptic functions are defined in Abramowitz & Stegun, Chapter 16. The functions are declared in the header file `gsl_sf_elljac.h'.
The error function is described in Abramowitz & Stegun, Chapter 7. The functions in this section are declared in the header file `gsl_sf_erf.h'.
The probability functions for the Normal or Gaussian distribution are described in Abramowitz & Stegun, Section 26.2.
The functions described in this section are declared in the header file `gsl_sf_exp.h'.
gsl_sf_result_e10 type to return a result with extended range.
This function may be useful if the value of \exp(x) would
overflow the numeric range of double.
gsl_sf_result_e10 type to return a result with extended numeric
range.
gsl_sf_exprel and
gsl_sf_exprel2. The N-relative exponential is given by,
gsl_sf_result_e10 type to return a result with
extended range.
gsl_sf_result_e10 type to return a result with extended range.
Information on the exponential integrals can be found in Abramowitz & Stegun, Chapter 5. These functions are declared in the header file `gsl_sf_expint.h'.
M_EULER).
The functions described in this section are declared in the header file `gsl_sf_fermi_dirac.h'.
The complete Fermi-Dirac integral F_j(x) is given by,
The incomplete Fermi-Dirac integral F_j(x,b) is given by,
The Gamma function is defined by the following integral,
Further information on the Gamma function can be found in Abramowitz & Stegun, Chapter 6. The functions described in this section are declared in the header file `gsl_sf_gamma.h'.
GSL_SF_GAMMA_XMAX
and is 171.0.
and is a useful suggestion of Temme.
GSL_ELOSS error when it occurs. The absolute
value part (lnr), however, never suffers from loss of precision.
gsl_sf_lngamma for n < 170,
but defers for larger n.
n choose m
= n!/(m!(n-m)!)
n choose m. This is
equivalent to the sum \log(n!) - \log(m!) - \log((n-m)!).
Note that Abramowitz & Stegun call P(a,x) the incomplete gamma function (section 6.5).
The Gegenbauer polynomials are defined in Abramowitz & Stegun, Chapter 22, where they are known as Ultraspherical polynomials. The functions described in this section are declared in the header file `gsl_sf_gegenbauer.h'.
Hypergeometric functions are described in Abramowitz & Stegun, Chapters 13 and 15. These functions are declared in the header file `gsl_sf_hyperg.h'.
gsl_sf_result_e10 type to return a result with extended range.
gsl_sf_result_e10 type to return a
result with extended range.
If the arguments (a,b,c,x) are too close to a singularity then
the function can return the error code GSL_EMAXITER when the
series approximation converges too slowly. This occurs in the region of
x=1, c - a - b = m for integer m.
The Laguerre polynomials are defined in terms of confluent hypergeometric functions as L^a_n(x) = ((a+1)_n / n!) 1F1(-n,a+1,x). These functions are declared in the header file `gsl_sf_laguerre.h'.
Lambert's W functions, W(x), are defined to be solutions of the equation W(x) \exp(W(x)) = x. This function has multiple branches for x < 0; however, it has only two real-valued branches. We define W_0(x) to be the principal branch, where W > -1 for x < 0, and W_{-1}(x) to be the other real branch, where W < -1 for x < 0. The Lambert functions are declared in the header file `gsl_sf_lambert.h'.
The Legendre Functions and Legendre Polynomials are described in Abramowitz & Stegun, Chapter 8. These functions are declared in the header file `gsl_sf_legendre.h'.
The following functions compute the associated Legendre Polynomials
P_l^m(x). Note that this function grows combinatorially with
l and can overflow for l larger than about 150. There is
no trouble for small m, but overflow occurs when m and
l are both large. Rather than allow overflows, these functions
refuse to calculate P_l^m(x) and return GSL_EOVRFLW when
they can sense that l and m are too big.
If you want to calculate a spherical harmonic, then do not use
these functions. Instead use gsl_sf_legendre_sphPlm() below,
which uses a similar recursion, but with the normalized functions.
The Conical Functions P^\mu_{-(1/2)+i\lambda}(x), Q^\mu_{-(1/2)+i\lambda} are described in Abramowitz & Stegun, Section 8.12.
The following spherical functions are specializations of Legendre functions which give the regular eigenfunctions of the Laplacian on a 3-dimensional hyperbolic space H3d. Of particular interest is the flat limit, \lambda \to \infty, \eta \to 0, \lambda\eta fixed.
Information on the properties of the Logarithm function can be found in Abramowitz & Stegun, Chapter 4. The functions described in this section are declared in the header file `gsl_sf_log.h'.
The following functions are equivalent to the function gsl_pow_int
(see section Small integer powers) with an error estimate. These functions are
declared in the header file `gsl_sf_pow_int.h'.
#include <gsl/gsl_sf_pow_int.h>
/* compute 3.0**12 */
double y = gsl_sf_pow_int(3.0, 12);
The polygamma functions of order m are defined by \psi^{(m)}(x) = (d/dx)^m \psi(x) = (d/dx)^{m+1} \log(\Gamma(x)), where \psi(x) = \Gamma'(x)/\Gamma(x) is known as the digamma function. These functions are declared in the header file `gsl_sf_psi.h'.
The functions described in this section are declared in the header file `gsl_sf_synchrotron.h'.
The transport functions J(n,x) are defined by the integral representations J(n,x) := \int_0^x dt t^n e^t /(e^t - 1)^2. They are declared in the header file `gsl_sf_transport.h'.
The library includes its own trigonometric functions in order to provide consistency across platforms and reliable error estimates. These functions are declared in the header file `gsl_sf_trig.h'.
The Riemann zeta function is defined in Abramowitz & Stegun, Section 23.2. The functions described in this section are declared in the header file `gsl_sf_zeta.h'.
The Riemann zeta function is defined by the infinite sum \zeta(s) = \sum_{k=1}^\infty k^{-s}.
The Hurwitz zeta function is defined by \zeta(s,q) = \sum_{k=0}^\infty (k+q)^{-s}.
The eta function is defined by \eta(s) = (1-2^{1-s}) \zeta(s).
The following example demonstrates the use of the error handling form of the special functions, in this case to compute the Bessel function J_0(5.0),
#include <stdio.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_sf_bessel.h>
int
main (void)
{
double x = 5.0;
gsl_sf_result result;
double expected = -0.17759677131433830434739701;
int status = gsl_sf_bessel_J0_e (x, &result);
printf("status = %s\n", gsl_strerror(status));
printf("J0(5.0) = %.18f\n"
" +/- % .18f\n",
result.val, result.err);
printf("exact = %.18f\n", expected);
return status;
}
Here are the results of running the program,
$ ./a.out
status = success
J0(5.0) = -0.177596771314338292
+/- 0.000000000000000193
exact = -0.177596771314338292
The next program computes the same quantity using the natural form of the function. In this case the error term result.err and return status are not accessible.
#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>
int
main (void)
{
double x = 5.0;
double expected = -0.17759677131433830434739701;
double y = gsl_sf_bessel_J0 (x);
printf("J0(5.0) = %.18f\n", y);
printf("exact = %.18f\n", expected);
return 0;
}
The results of the function are the same,
$ ./a.out
J0(5.0) = -0.177596771314338292
exact = -0.177596771314338292
The library follows the conventions of Abramowitz & Stegun where possible,
The following papers contain information on the algorithms used to compute the special functions,
The functions described in this chapter provide a simple vector and matrix interface to ordinary C arrays. The memory management of these arrays is implemented using a single underlying type, known as a block. By writing your functions in terms of vectors and matrices you can pass a single structure containing both data and dimensions as an argument without needing additional function parameters. The structures are compatible with the vector and matrix formats used by BLAS routines.
All the functions are available for each of the standard data-types.
The versions for double have the prefix gsl_block,
gsl_vector and gsl_matrix. Similarly the versions for
single-precision float arrays have the prefix
gsl_block_float, gsl_vector_float and
gsl_matrix_float. The full list of available types is given
below,
gsl_block                       double
gsl_block_float                 float
gsl_block_long_double           long double
gsl_block_int                   int
gsl_block_uint                  unsigned int
gsl_block_long                  long
gsl_block_ulong                 unsigned long
gsl_block_short                 short
gsl_block_ushort                unsigned short
gsl_block_char                  char
gsl_block_uchar                 unsigned char
gsl_block_complex               complex double
gsl_block_complex_float         complex float
gsl_block_complex_long_double   complex long double
Corresponding types exist for the gsl_vector and
gsl_matrix functions.
For consistency all memory is allocated through a gsl_block
structure. The structure contains two components, the size of an area of
memory and a pointer to the memory. The gsl_block structure looks
like this,
typedef struct
{
size_t size;
double * data;
} gsl_block;
Vectors and matrices are made by slicing an underlying block. A slice is a set of elements formed from an initial offset and a combination of indices and step-sizes. In the case of a matrix the step-size for the column index represents the row-length. The step-size for a vector is known as the stride.
The functions for allocating and deallocating blocks are defined in `gsl_block.h'.
The functions for allocating memory to a block follow the style of
malloc and free. In addition they also perform their own
error checking. If there is insufficient memory available to allocate a
block then the functions call the GSL error handler (with an error
number of GSL_ENOMEM) in addition to returning a null
pointer. Thus if you use the library error handler to abort your program
then it isn't necessary to check every alloc.
gsl_block_calloc if you want to ensure that all the
elements are initialized to zero.
A null pointer is returned if insufficient memory is available to create the block.
gsl_block_alloc or gsl_block_calloc.
The library provides functions for reading and writing blocks to a file as binary data or formatted text.
GSL_EFAILED if there was a problem writing to the file. Since the
data is written in the native binary format it may not be portable
between different architectures.
GSL_EFAILED if there was a problem reading from the file. The
data is assumed to have been written in the native binary format on the
same architecture.
%g, %e or %f formats for
floating point numbers and %d for integers. The function returns
0 for success and GSL_EFAILED if there was a problem writing to
the file.
GSL_EFAILED if there was a problem reading from the file.
The following program shows how to allocate a block,
#include <stdio.h>
#include <gsl/gsl_block.h>
int
main (void)
{
gsl_block * b = gsl_block_alloc (100);
printf("length of block = %lu\n", (unsigned long) b->size);
printf("block data address = %p\n", (void *) b->data);
gsl_block_free (b);
return 0;
}
Here is the output from the program,
length of block = 100
block data address = 0x804b0d8
Vectors are defined by a gsl_vector structure which describes a
slice of a block. Different vectors can be created which point to the
same block. A vector slice is a set of equally-spaced elements of an
area of memory.
The gsl_vector structure contains five components, the
size, the stride, a pointer to the memory where the elements
are stored, data, a pointer to the block owned by the vector,
block, if any, and an ownership flag, owner. The structure
is very simple and looks like this,
typedef struct
{
size_t size;
size_t stride;
double * data;
gsl_block * block;
int owner;
} gsl_vector;
The size is simply the number of vector elements. The range of
valid indices runs from 0 to size-1. The stride is the
step-size from one element to the next in physical memory, measured in
units of the appropriate datatype. The pointer data gives the
location of the first element of the vector in memory. The pointer
block stores the location of the memory block in which the vector
elements are located (if any). If the vector owns this block then the
owner field is set to one and the block will be deallocated when the
vector is freed. If the vector points to a block owned by another
object then the owner field is zero and any underlying block will not be
deallocated.
The functions for allocating and accessing vectors are defined in `gsl_vector.h'.
The functions for allocating memory to a vector follow the style of
malloc and free. In addition they also perform their own
error checking. If there is insufficient memory available to allocate a
vector then the functions call the GSL error handler (with an error
number of GSL_ENOMEM) in addition to returning a null
pointer. Thus if you use the library error handler to abort your program
then it isn't necessary to check every alloc.
gsl_vector_alloc then the block
underlying the vector will also be deallocated. If the vector has been
created from another object then the memory is still owned by that
object and will not be deallocated.
Unlike FORTRAN compilers, C compilers do not usually provide
support for range checking of vectors and matrices. Range checking is
available in the GNU C Compiler extension checkergcc but it is
not available on every platform. The functions gsl_vector_get
and gsl_vector_set can perform portable range checking for you
and report an error if you attempt to access elements outside the
allowed range.
The functions for accessing the elements of a vector or matrix are
defined in `gsl_vector.h' and declared extern inline to
eliminate function-call overhead. If necessary you can turn off range
checking completely without modifying any source files by recompiling
your program with the preprocessor definition
GSL_RANGE_CHECK_OFF. Provided your compiler supports inline
functions the effect of turning off range checking is to replace calls
to gsl_vector_get(v,i) by v->data[i*v->stride] and
calls to gsl_vector_set(v,i,x) by v->data[i*v->stride]=x.
Thus there should be no performance penalty for using the range checking
functions when range checking is turned off.
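The idea of a range-checked accessor that reduces to a plain strided array access can be sketched without the library. The struct and function below are hypothetical stand-ins for illustration. Unlike the library functions, this sketch reports an out-of-range index through a flag rather than by calling an error handler.

```c
#include <stddef.h>

/* Minimal stand-in for a strided vector (illustration only). */
typedef struct
{
  size_t size;
  size_t stride;
  double *data;
} vector_sketch;

/* Range-checked read.  Sets *ok to 1 and returns element i, or sets
   *ok to 0 and returns 0.0 if i is outside 0 .. size-1. */
double
vector_get_sketch (const vector_sketch * v, size_t i, int *ok)
{
  if (i >= v->size)
    {
      *ok = 0;
      return 0.0;
    }

  *ok = 1;
  return v->data[i * v->stride];  /* the unchecked access */
}
```

With a stride of 2 over the array {0, 1, 2, 3, 4, 5}, index 2 refers to the array element holding 4.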
The library provides functions for reading and writing vectors to a file as binary data or formatted text.
GSL_EFAILED if there was a problem writing to the file. Since the
data is written in the native binary format it may not be portable
between different architectures.
GSL_EFAILED if there was a problem reading from the file. The
data is assumed to have been written in the native binary format on the
same architecture.
%g, %e or %f formats for
floating point numbers and %d for integers. The function returns
0 for success and GSL_EFAILED if there was a problem writing to
the file.
GSL_EFAILED if there was a problem reading from the file.
In addition to creating vectors from slices of blocks it is also possible to slice vectors and create vector views. For example, a subvector of another vector can be described with a view, or two views can be made which provide access to the even and odd elements of a vector.
A vector view is a temporary object, stored on the stack, which can be
used to operate on a subset of vector elements. Vector views can be
defined for both constant and non-constant vectors, using separate types
that preserve constness. A vector view has the type
gsl_vector_view and a constant vector view has the type
gsl_vector_const_view. In both cases the elements of the view
can be accessed as a gsl_vector using the vector component
of the view object. A pointer to a vector of type gsl_vector *
or const gsl_vector * can be obtained by taking the address of
this component with the & operator.
v'(i) = v->data[(offset + i)*v->stride]
where the index i runs from 0 to n-1.
The data pointer of the returned vector struct is set to null if
the combined parameters (offset,n) overrun the end of the
original vector.
The new vector is only a view of the block underlying the original vector, v. The block containing the elements of v is not owned by the new vector. When the view goes out of scope the original vector v and its block will continue to exist. The original memory can only be deallocated by freeing the original vector. Of course, the original vector should not be deallocated while the view is still in use.
The function gsl_vector_const_subvector is equivalent to
gsl_vector_subvector but can be used for vectors which are
declared const.
gsl_vector_subvector but the new vector has
n elements with a step-size of stride from one element to
the next in the original vector. Mathematically, the i-th element
of the new vector v' is given by,
v'(i) = v->data[(offset + i*stride)*v->stride]
where the index i runs from 0 to n-1.
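The index formula above follows from composing two strides. In the sketch below, which uses a hypothetical stand-in struct rather than the library types, the view's data pointer is advanced by offset elements of the original vector and its stride is the product of the two step-sizes, so that element i of the view is v->data[(offset + i*stride)*v->stride]. A null data pointer is returned when the requested elements would overrun the original vector.

```c
#include <stddef.h>

/* Minimal stand-in for a strided vector (illustration only). */
typedef struct
{
  size_t size;
  size_t stride;
  double *data;
} view_sketch;

/* View of n elements of v, starting at offset, taking every
   stride-th element of v. */
view_sketch
subvector_with_stride_sketch (const view_sketch * v, size_t offset,
                              size_t stride, size_t n)
{
  view_sketch s = { 0, 0, NULL };

  if (n == 0 || stride == 0 || offset + (n - 1) * stride >= v->size)
    return s;                     /* overrun: data stays null */

  s.size = n;
  s.stride = stride * v->stride;  /* combined step in memory */
  s.data = v->data + offset * v->stride;
  return s;
}
```

Applied to a six-element vector with offset 1, stride 2 and n = 3, the view covers the elements with original indices 1, 3 and 5.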
Note that subvector views give direct access to the underlying elements
of the original vector. For example, the following code will zero the
even elements of the vector v of length n, while leaving the
odd elements untouched,
gsl_vector_view v_even
  = gsl_vector_subvector_with_stride (v, 0, 2, n/2);
gsl_vector_set_zero (&v_even.vector);
A vector view can be passed to any subroutine which takes a vector
argument just as a directly allocated vector would be, using
&view.vector. For example, the following code
computes the norm of odd elements of v using the BLAS
routine DNRM2,
gsl_vector_view v_odd
  = gsl_vector_subvector_with_stride (v, 1, 2, n/2);
double r = gsl_blas_dnrm2 (&v_odd.vector);
The function gsl_vector_const_subvector_with_stride is equivalent
to gsl_vector_subvector_with_stride but can be used for vectors
which are declared const.
The function gsl_vector_complex_const_real is equivalent to
gsl_vector_complex_real but can be used for vectors which are
declared const.
The function gsl_vector_complex_const_imag is equivalent to
gsl_vector_complex_imag but can be used for vectors which are
declared const.
v'(i) = base[i]
where the index i runs from 0 to n-1.
The array containing the elements of v is not owned by the new vector view. When the view goes out of scope the original array will continue to exist. The original memory can only be deallocated by freeing the original pointer base. Of course, the original array should not be deallocated while the view is still in use.
The function gsl_vector_const_view_array is equivalent to
gsl_vector_view_array but can be used for arrays which are
declared const.
gsl_vector_view_array but the new vector has n elements
with a step-size of stride from one element to the next in the
original array. Mathematically, the i-th element of the new
vector v' is given by,
v'(i) = base[i*stride]
where the index i runs from 0 to n-1.
Note that the view gives direct access to the underlying elements of the
original array. A vector view can be passed to any subroutine which
takes a vector argument just as a directly allocated vector would be,
using &view.vector.
The function gsl_vector_const_view_array_with_stride is
equivalent to gsl_vector_view_array_with_stride but can be used
for arrays which are declared const.
Common operations on vectors such as addition and multiplication are available in the BLAS part of the library (see section BLAS Support). However, it is useful to have a small number of utility functions which do not require the full BLAS code. The following functions fall into this category.
The following function can be used to exchange, or permute, the elements of a vector.
The following operations are only defined for real vectors.
This program shows how to allocate, initialize and read from a vector
using the functions gsl_vector_alloc, gsl_vector_set and
gsl_vector_get.
#include <stdio.h>
#include <gsl/gsl_vector.h>
int
main (void)
{
int i;
gsl_vector * v = gsl_vector_alloc (3);
for (i = 0; i < 3; i++)
{
gsl_vector_set (v, i, 1.23 + i);
}
for (i = 0; i < 100; i++)
{
printf("v_%d = %g\n", i, gsl_vector_get (v, i));
}
return 0;
}
Here is the output from the program. The final loop attempts to read
outside the range of the vector v, and the error is trapped by
the range-checking code in gsl_vector_get.
v_0 = 1.23
v_1 = 2.23
v_2 = 3.23
gsl: vector_source.c:12: ERROR: index out of range
IOT trap/Abort (core dumped)
The next program shows how to write a vector to a file.
#include <stdio.h>
#include <gsl/gsl_vector.h>
int
main (void)
{
int i;
gsl_vector * v = gsl_vector_alloc (100);
for (i = 0; i < 100; i++)
{
gsl_vector_set (v, i, 1.23 + i);
}
{
FILE * f = fopen("test.dat", "w");
gsl_vector_fprintf (f, v, "%.5g");
fclose (f);
}
return 0;
}
After running this program the file `test.dat' should contain the
elements of v, written using the format specifier
%.5g. The vector could then be read back in using the function
gsl_vector_fscanf (f, v) as follows:
#include <stdio.h>
#include <gsl/gsl_vector.h>
int
main (void)
{
int i;
gsl_vector * v = gsl_vector_alloc (10);
{
FILE * f = fopen("test.dat", "r");
gsl_vector_fscanf (f, v);
fclose (f);
}
for (i = 0; i < 10; i++)
{
printf("%g\n", gsl_vector_get(v, i));
}
return 0;
}
Matrices are defined by a gsl_matrix structure which describes a
generalized slice of a block. Like a vector it represents a set of
elements in an area of memory, but uses two indices instead of one.
The gsl_matrix structure contains six components, the two
dimensions of the matrix, a physical dimension, a pointer to the memory
where the elements of the matrix are stored, data, a pointer to
the block owned by the matrix block, if any, and an ownership
flag, owner. The physical dimension determines the memory layout
and can differ from the matrix dimension to allow the use of
submatrices. The gsl_matrix structure is very simple and looks
like this,
typedef struct
{
size_t size1;
size_t size2;
size_t tda;
double * data;
gsl_block * block;
int owner;
} gsl_matrix;
Matrices are stored in row-major order, meaning that each row of
elements forms a contiguous block in memory. This is the standard
"C-language ordering" of two-dimensional arrays. Note that FORTRAN
stores arrays in column-major order. The number of rows is size1.
The range of valid row indices runs from 0 to size1-1. Similarly
size2 is the number of columns. The range of valid column indices
runs from 0 to size2-1. The physical row dimension tda, or
trailing dimension, specifies the size of a row of the matrix as
laid out in memory.
For example, in the following matrix size1 is 3, size2 is 4, and tda is 8. The physical memory layout of the matrix begins in the top left hand-corner and proceeds from left to right along each row in turn.
00 01 02 03 XX XX XX XX 10 11 12 13 XX XX XX XX 20 21 22 23 XX XX XX XX
Each unused memory location is represented by "XX". The
pointer data gives the location of the first element of the matrix
in memory. The pointer block stores the location of the memory
block in which the elements of the matrix are located (if any). If the
matrix owns this block then the owner field is set to one and the
block will be deallocated when the matrix is freed. If the matrix is
only a slice of a block owned by another object then the owner field is
zero and any underlying block will not be freed.
The functions for allocating and accessing matrices are defined in `gsl_matrix.h'
The functions for allocating memory to a matrix follow the style of
malloc and free. They also perform their own error
checking. If there is insufficient memory available to allocate a matrix
then the functions call the GSL error handler (with an error number of
GSL_ENOMEM) in addition to returning a null pointer. Thus if you
use the library error handler to abort your program then it isn't
necessary to check every alloc.
gsl_matrix_alloc then the block
underlying the matrix will also be deallocated. If the matrix has been
created from another object then the memory is still owned by that
object and will not be deallocated.
The functions for accessing the elements of a matrix use the same range
checking system as vectors. You can turn off range checking by recompiling
your program with the preprocessor definition
GSL_RANGE_CHECK_OFF.
The elements of the matrix are stored in "C-order", where the second
index moves continuously through memory. More precisely, the element
accessed by the functions gsl_matrix_get(m,i,j) and
gsl_matrix_set(m,i,j,x) is
m->data[i * m->tda + j]
where tda is the physical row-length of the matrix.
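The row-major indexing rule can be written out in plain C as a minimal sketch. The helper names matrix_offset and matrix_get_checked are hypothetical; the range check only imitates the idea behind the library's checking and does not call the GSL error handler.

```c
#include <stddef.h>

/* Offset of element (i,j) from the start of the data array, for a
   row-major matrix whose rows are tda elements apart in memory. */
size_t
matrix_offset (size_t i, size_t j, size_t tda)
{
  return i * tda + j;
}

/* Range-checked read.  Returns 0 and stores the element in *out on
   success, or -1 if (i,j) lies outside the size1-by-size2 matrix.
   Hypothetical helper, not the library's own error handling. */
int
matrix_get_checked (const double *data, size_t size1, size_t size2,
                    size_t tda, size_t i, size_t j, double *out)
{
  if (i >= size1 || j >= size2)
    return -1;
  *out = data[matrix_offset (i, j, tda)];
  return 0;
}
```

For the 3-by-4 example above with tda equal to 8, element (1,0) sits at offset 8 and element (2,3) at offset 19; the locations marked "XX" are never addressed.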
The library provides functions for reading and writing matrices to a file as binary data or formatted text.
GSL_EFAILED if there was a problem writing to the file. Since the
data is written in the native binary format it may not be portable
between different architectures.
GSL_EFAILED if there was a problem reading from the file. The
data is assumed to have been written in the native binary format on the
same architecture.
%g, %e or %f formats for
floating point numbers and %d for integers. The function returns
0 for success and GSL_EFAILED if there was a problem writing to
the file.
GSL_EFAILED if there was a problem reading from the file.
A matrix view is a temporary object, stored on the stack, which can be
used to operate on a subset of matrix elements. Matrix views can be
defined for both constant and non-constant matrices using separate types
that preserve constness. A matrix view has the type
gsl_matrix_view and a constant matrix view has the type
gsl_matrix_const_view. In both cases the elements of the view
can be accessed using the matrix component of the view object. A
pointer gsl_matrix * or const gsl_matrix * can be obtained
by taking the address of the matrix component with the &
operator. In addition to matrix views it is also possible to create
vector views of a matrix, such as row or column views.
m'(i,j) = m->data[(k1*m->tda + k2) + i*m->tda + j]
where the index i runs from 0 to n1-1 and the index j
runs from 0 to n2-1.
The data pointer of the returned matrix struct is set to null if
the combined parameters (i,j,n1,n2,tda)
overrun the ends of the original matrix.
The new matrix view is only a view of the block underlying the existing matrix, m. The block containing the elements of m is not owned by the new matrix view. When the view goes out of scope the original matrix m and its block will continue to exist. The original memory can only be deallocated by freeing the original matrix. Of course, the original matrix should not be deallocated while the view is still in use.
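A submatrix view amounts to an offset data pointer that keeps the parent's row spacing. The sketch below shows this in plain C, assuming the upper-left element of the submatrix is element (k1,k2) of the parent. The type mat_view and the functions submatrix_view and mat_view_get are hypothetical and only mimic the behaviour described above, including the null data pointer on overrun.

```c
#include <stddef.h>

/* Hypothetical view type: borrows memory, never frees it. */
typedef struct
{
  double *data;   /* first element of the view */
  size_t size1;   /* number of rows */
  size_t size2;   /* number of columns */
  size_t tda;     /* physical row length of the underlying storage */
} mat_view;

/* View of the n1-by-n2 block of m whose upper-left element is
   (k1,k2).  The data pointer is NULL if the block overruns m. */
mat_view
submatrix_view (const mat_view *m, size_t k1, size_t k2,
                size_t n1, size_t n2)
{
  mat_view s = { NULL, 0, 0, 0 };

  if (k1 > m->size1 || n1 > m->size1 - k1
      || k2 > m->size2 || n2 > m->size2 - k2)
    return s;

  s.data = m->data + k1 * m->tda + k2;  /* shift to element (k1,k2) */
  s.size1 = n1;
  s.size2 = n2;
  s.tda = m->tda;                       /* rows keep the parent spacing */
  return s;
}

/* Element (i,j) of a view: data[i*tda + j]. */
double
mat_view_get (const mat_view *m, size_t i, size_t j)
{
  return m->data[i * m->tda + j];
}
```

Because the view stores only a pointer into the parent's storage, freeing the parent while the view is in use would leave the view dangling, which is the situation the text warns against.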
The function gsl_matrix_const_submatrix is equivalent to
gsl_matrix_submatrix but can be used for matrices which are
declared const.
m'(i,j) = base[i*n2 + j]
where the index i runs from 0 to n1-1 and the index j
runs from 0 to n2-1.
The new matrix is only a view of the array base. When the view goes out of scope the original array base will continue to exist. The original memory can only be deallocated by freeing the original array. Of course, the original array should not be deallocated while the view is still in use.
The function gsl_matrix_const_view_array is equivalent to
gsl_matrix_view_array but can be used for matrices which are
declared const.
m'(i,j) = base[i*tda + j]
where the index i runs from 0 to n1-1 and the index j
runs from 0 to n2-1.
The new matrix is only a view of the array base. When the view goes out of scope the original array base will continue to exist. The original memory can only be deallocated by freeing the original array. Of course, the original array should not be deallocated while the view is still in use.
The function gsl_matrix_const_view_array_with_tda is equivalent
to gsl_matrix_view_array_with_tda but can be used for matrices
which are declared const.
m'(i,j) = v->data[i*n2 + j]
where the index i runs from 0 to n1-1 and the index j
runs from 0 to n2-1.
The new matrix is only a view of the vector v. When the view goes out of scope the original vector v will continue to exist. The original memory can only be deallocated by freeing the original vector. Of course, the original vector should not be deallocated while the view is still in use.
The function gsl_matrix_const_view_vector is equivalent to
gsl_matrix_view_vector but can be used for matrices which are
declared const.
m'(i,j) = v->data[i*tda + j]
where the index i runs from 0 to n1-1 and the index j
runs from 0 to n2-1.
The new matrix is only a view of the vector v. When the view goes out of scope the original vector v will continue to exist. The original memory can only be deallocated by freeing the original vector. Of course, the original vector should not be deallocated while the view is still in use.
The function gsl_matrix_const_view_vector_with_tda is equivalent
to gsl_matrix_view_vector_with_tda but can be used for matrices
which are declared const.
In general there are two ways to access an object, by reference or by copying. The functions described in this section create vector views which allow access to a row or column of a matrix by reference. Modifying elements of the view is equivalent to modifying the matrix, since both the vector view and the matrix point to the same memory block.
data pointer of the new vector is set to null if
i is out of range.
The function gsl_matrix_const_row is equivalent to
gsl_matrix_row but can be used for matrices which are declared
const.
data pointer of the new vector is set to
null if j is out of range.
The function gsl_matrix_const_column is equivalent to
gsl_matrix_column but can be used for matrices which are declared
const.
The function gsl_matrix_const_diagonal is equivalent to
gsl_matrix_diagonal but can be used for matrices which are
declared const.
The function gsl_matrix_const_subdiagonal is equivalent to
gsl_matrix_subdiagonal but can be used for matrices which are
declared const.
The function gsl_matrix_const_superdiagonal is equivalent to
gsl_matrix_superdiagonal but can be used for matrices which are
declared const.
The functions described in this section copy a row or column of a matrix
into a vector. This allows the elements of the vector and the matrix to
be modified independently. Note that if the matrix and the vector point
to overlapping regions of memory then the result will be undefined. The
same effect can be achieved with more generality using
gsl_vector_memcpy with vector views of rows and columns.
The following functions can be used to exchange the rows and columns of a matrix.
The following operations are only defined for real matrices.
The program below shows how to allocate, initialize and read from a matrix
using the functions gsl_matrix_alloc, gsl_matrix_set and
gsl_matrix_get.
#include <stdio.h>
#include <gsl/gsl_matrix.h>
int
main (void)
{
int i, j;
gsl_matrix * m = gsl_matrix_alloc (10, 3);
for (i = 0; i < 10; i++)
for (j = 0; j < 3; j++)
gsl_matrix_set (m, i, j, 0.23 + 100*i + j);
for (i = 0; i < 100; i++)
for (j = 0; j < 3; j++)
printf("m(%d,%d) = %g\n", i, j,
gsl_matrix_get (m, i, j));
return 0;
}
Here is the output from the program. The final loop attempts to read
outside the range of the matrix m, and the error is trapped by
the range-checking code in gsl_matrix_get.
m(0,0) = 0.23
m(0,1) = 1.23
m(0,2) = 2.23
m(1,0) = 100.23
m(1,1) = 101.23
m(1,2) = 102.23
...
m(9,2) = 902.23
gsl: matrix_source.c:13: ERROR: first index out of range
IOT trap/Abort (core dumped)
The next program shows how to write a matrix to a file.
#include <stdio.h>
#include <gsl/gsl_matrix.h>
int
main (void)
{
int i, j, k = 0;
gsl_matrix * m = gsl_matrix_alloc (100, 100);
gsl_matrix * a = gsl_matrix_alloc (100, 100);
for (i = 0; i < 100; i++)
for (j = 0; j < 100; j++)
gsl_matrix_set (m, i, j, 0.23 + i + j);
{
FILE * f = fopen("test.dat", "w");
gsl_matrix_fwrite (f, m);
fclose (f);
}
{
FILE * f = fopen("test.dat", "r");
gsl_matrix_fread (f, a);
fclose (f);
}
for (i = 0; i < 100; i++)
for (j = 0; j < 100; j++)
{
double mij = gsl_matrix_get(m, i, j);
double aij = gsl_matrix_get(a, i, j);
if (mij != aij) k++;
}
printf("differences = %d (should be zero)\n", k);
return (k > 0);
}
After running this program the file `test.dat' should contain the
elements of m, written in binary format. The matrix which is read
back in using the function gsl_matrix_fread should be exactly
equal to the original matrix.
The following program demonstrates the use of vector views. The program computes the column-norms of a matrix.
#include <math.h>
#include <stdio.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_blas.h>
int
main (void)
{
size_t i,j;
gsl_matrix *m = gsl_matrix_alloc (10, 10);
for (i = 0; i < 10; i++)
for (j = 0; j < 10; j++)
gsl_matrix_set (m, i, j, sin (i) + cos (j));
for (j = 0; j < 10; j++)
{
gsl_vector_view column = gsl_matrix_column (m, j);
double d;
d = gsl_blas_dnrm2 (&column.vector);
printf ("matrix column %zu, norm = %g\n", j, d);
}
gsl_matrix_free (m);
return 0;
}
Here is the output of the program, which can be confirmed using GNU OCTAVE,
$ ./a.out
matrix column 0, norm = 4.31461
matrix column 1, norm = 3.1205
matrix column 2, norm = 2.19316
matrix column 3, norm = 3.26114
matrix column 4, norm = 2.53416
matrix column 5, norm = 2.57281
matrix column 6, norm = 4.20469
matrix column 7, norm = 3.65202
matrix column 8, norm = 2.08524
matrix column 9, norm = 3.07313
octave> m = sin(0:9)' * ones(1,10)
+ ones(10,1) * cos(0:9);
octave> sqrt(sum(m.^2))
ans =
4.3146 3.1205 2.1932 3.2611 2.5342 2.5728
4.2047 3.6520 2.0852 3.0731
The block, vector and matrix objects in GSL follow the valarray
model of C++. A description of this model can be found in the following
reference,
This chapter describes functions for creating and manipulating permutations. A permutation p is represented by an array of n integers in the range 0 .. n-1, where each value p_i occurs once and only once. The application of a permutation p to a vector v yields a new vector v' where v'_i = v_{p_i}. For example, the array (0,1,3,2) represents a permutation which exchanges the last two elements of a four element vector. The corresponding identity permutation is (0,1,2,3).
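The rule v'_i = v_{p_i} can be illustrated with a short plain-C sketch. The function names apply_permutation and is_permutation are hypothetical and are not the library's own routines; the validity check uses a simple quadratic scan for clarity.

```c
#include <stddef.h>

/* out[i] = v[p[i]] for i = 0 .. n-1.  The arrays out and v must not
   overlap. */
void
apply_permutation (double *out, const double *v,
                   const size_t *p, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    out[i] = v[p[i]];
}

/* Returns 1 if every value 0 .. n-1 occurs once and only once in p,
   and 0 otherwise. */
int
is_permutation (const size_t *p, size_t n)
{
  size_t i, j;
  for (i = 0; i < n; i++)
    {
      if (p[i] >= n)
        return 0;               /* value out of range */
      for (j = 0; j < i; j++)
        if (p[j] == p[i])
          return 0;             /* value repeated */
    }
  return 1;
}
```

Applying the array (0,1,3,2) to the vector (10,20,30,40) gives (10,20,40,30), exchanging the last two elements as described.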
Note that the permutations produced by the linear algebra routines correspond to the exchange of matrix columns, and so should be considered as applying to row-vectors in the form v' = v P rather than column-vectors, when permuting the elements of a vector.
The functions described in this chapter are defined in the header file `gsl_permutation.h'.
A permutation is stored by a structure containing two components, the size
of the permutation and a pointer to the permutation array. The elements
of the permutation array are all of type size_t. The
gsl_permutation structure looks like this,
typedef struct
{
size_t size;
size_t * data;
} gsl_permutation;
gsl_permutation_calloc if you want to create a
permutation which is initialized to the identity. A null pointer is
returned if insufficient memory is available to create the permutation.
The following functions can be used to access and manipulate permutations.
GSL_SUCCESS. If no further
permutations are available it returns GSL_FAILURE and leaves
p unmodified. Starting with the identity permutation and
repeatedly applying this function will iterate through all possible
permutations of a given order.
GSL_SUCCESS. If no previous permutation is available it returns
GSL_FAILURE and leaves p unmodified.
The library provides functions for reading and writing permutations to a file as binary data or formatted text.
GSL_EFAILED if there was a problem writing to the file. Since the
data is written in the native binary format it may not be portable
between different architectures.
GSL_EFAILED if there was a problem reading from the file. The
data is assumed to have been written in the native binary format on the
same architecture.
Z represents size_t, so
"%Zu\n" is a suitable format. The function returns
GSL_EFAILED if there was a problem writing to the file.
GSL_EFAILED if there was a problem reading from the file.
A permutation can be represented in both linear and cyclic notations. The functions described in this section can be used to convert between the two forms.
The linear notation is an index mapping, and has already been described above. The cyclic notation represents a permutation as a series of circular rearrangements of groups of elements, or cycles.
Any permutation can be decomposed into a combination of cycles. For example, under the cycle (1 2 3), 1 is replaced by 2, 2 is replaced by 3 and 3 is replaced by 1 in a circular fashion. Cycles of different sets of elements can be combined independently, for example (1 2 3) (4 5) combines the cycle (1 2 3) with the cycle (4 5), which is an exchange of elements 4 and 5. A cycle of length one represents an element which is unchanged by the permutation and is referred to as a singleton.
The cyclic notation for a permutation is not unique, but can be rearranged into a unique canonical form by a reordering of elements. The library uses the canonical form defined in Knuth's Art of Computer Programming (Vol 1, 3rd Ed, 1997) Section 1.3.3, p.178.
The procedure for obtaining the canonical form given by Knuth is,
For example, the linear representation (2 4 3 0 1) is represented as (1 4) (0 2 3) in canonical form. The permutation corresponds to an exchange of elements 1 and 4, and rotation of elements 0, 2 and 3.
The important property of the canonical form is that it can be reconstructed from the contents of each cycle without the brackets. In addition, by removing the brackets it can be considered as a linear representation of a different permutation. In the example given above the permutation (2 4 3 0 1) would become (1 4 0 2 3). This mapping between linear permutations defined by the canonical form has many important uses in the theory of permutations.
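The conversion from linear to canonical form can be sketched in plain C. The sketch assumes the canonical form places the least element of each cycle first and orders the cycles by decreasing first element, which matches the example above. The name linear_to_canonical is hypothetical, and this is not necessarily how the library implements the conversion.

```c
#include <stddef.h>

/* Write the canonical cycle form of the permutation p (length n)
   into q, with the brackets removed.  The arrays must not overlap.
   Assumed convention: each cycle starts with its least element and
   cycles appear in decreasing order of that element. */
void
linear_to_canonical (size_t *q, const size_t *p, size_t n)
{
  size_t t = 0;                 /* next free position in q */
  size_t s;

  /* Visit candidate cycle leaders from n-1 down to 0. */
  for (s = n; s-- > 0;)
    {
      size_t k = p[s];

      /* Follow the cycle until it returns to s or reaches a
         smaller element. */
      while (k > s)
        k = p[k];

      if (k < s)
        continue;               /* s is not the least element */

      /* s leads its cycle: emit s, p[s], p[p[s]], ... */
      q[t++] = s;
      for (k = p[s]; k != s; k = p[k])
        q[t++] = k;
    }
}
```

For the linear representation (2 4 3 0 1) this produces (1 4 0 2 3), the bracket-free form of (1 4) (0 2 3).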
The example program below creates a random permutation by shuffling and finds its inverse.
#include <stdio.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include <gsl/gsl_permutation.h>
int
main (void)
{
const size_t N = 10;
const gsl_rng_type * T;
gsl_rng * r;
gsl_permutation * p = gsl_permutation_alloc (N);
gsl_permutation * q = gsl_permutation_alloc (N);
gsl_rng_env_setup();
T = gsl_rng_default;
r = gsl_rng_alloc (T);
printf("initial permutation:");
gsl_permutation_init (p);
gsl_permutation_fprintf (stdout, p, " %u");
printf("\n");
printf(" random permutation:");
gsl_ran_shuffle (r, p->data, N, sizeof(size_t));
gsl_permutation_fprintf (stdout, p, " %u");
printf("\n");
printf("inverse permutation:");
gsl_permutation_inverse (q, p);
gsl_permutation_fprintf (stdout, q, " %u");
printf("\n");
return 0;
}
Here is the output from the program,
bash$ ./a.out
initial permutation: 0 1 2 3 4 5 6 7 8 9
 random permutation: 1 3 5 2 7 6 0 4 9 8
inverse permutation: 6 0 3 1 7 2 5 4 9 8
The random permutation p[i] and its inverse q[i] are
related through the identity p[q[i]] = i, which can be verified
from the output.
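The identity p[q[i]] = i also shows how an inverse can be built: set q[p[i]] = i for every i. The following plain-C sketch uses hypothetical helper names, invert_permutation and is_inverse, and is independent of the library.

```c
#include <stddef.h>

/* Build the inverse q of the permutation p (length n), so that
   q[p[i]] = i.  The arrays must not overlap. */
void
invert_permutation (size_t *q, const size_t *p, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    q[p[i]] = i;
}

/* Returns 1 if p[q[i]] == i for all i, and 0 otherwise. */
int
is_inverse (const size_t *p, const size_t *q, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (p[q[i]] != i)
      return 0;
  return 1;
}
```

Applied to the random permutation printed above, this reproduces the inverse shown in the program output.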
The next example program steps forwards through all possible third-order permutations, starting from the identity,
#include <stdio.h>
#include <gsl/gsl_permutation.h>
int
main (void)
{
gsl_permutation * p = gsl_permutation_alloc (3);
gsl_permutation_init (p);
do
{
gsl_permutation_fprintf (stdout, p, " %u");
printf("\n");
}
while (gsl_permutation_next(p) == GSL_SUCCESS);
return 0;
}
Here is the output from the program,
bash$ ./a.out
 0 1 2
 0 2 1
 1 0 2
 1 2 0
 2 0 1
 2 1 0
All 6 permutations are generated in lexicographic order. To reverse the
sequence, begin with the final permutation (which is the reverse of the
identity) and replace gsl_permutation_next with
gsl_permutation_prev.
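Stepping to the next permutation in lexicographic order can be sketched in plain C with the standard pivot-and-reverse technique. The function next_permutation below is a hypothetical illustration and is not claimed to be the library's implementation; like the library routine, it leaves the array unmodified when no further permutation exists.

```c
#include <stddef.h>

/* Advance a (a permutation of distinct values, length n) to the next
   permutation in lexicographic order.  Returns 1 on success, or 0 if
   a is already the last permutation, in which case a is unchanged. */
int
next_permutation (size_t *a, size_t n)
{
  size_t i, j, lo, hi, tmp;

  if (n < 2)
    return 0;

  /* Find the start i of the longest decreasing suffix. */
  i = n - 1;
  while (i > 0 && a[i - 1] > a[i])
    i--;
  if (i == 0)
    return 0;                   /* whole array is decreasing */

  /* Find the rightmost element larger than the pivot a[i-1]. */
  j = n - 1;
  while (a[j] < a[i - 1])
    j--;

  tmp = a[i - 1];
  a[i - 1] = a[j];
  a[j] = tmp;

  /* Reverse the suffix so that it becomes increasing. */
  for (lo = i, hi = n - 1; lo < hi; lo++, hi--)
    {
      tmp = a[lo];
      a[lo] = a[hi];
      a[hi] = tmp;
    }
  return 1;
}
```

Starting from (0,1,2) and calling the function until it returns 0 visits six arrangements in total, in the same order as the output above.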
The subject of permutations is covered extensively in Knuth's Sorting and Searching,
For the definition of the canonical form see,
This chapter describes functions for creating and manipulating combinations. A combination c is represented by an array of k integers in the range 0 .. n-1, where each value c_i occurs at most once. The combination c corresponds to indices of k elements chosen from an n element vector. Combinations are useful for iterating over all k-element subsets of a set.
The functions described in this chapter are defined in the header file `gsl_combination.h'.
A combination is stored by a structure containing three components, the
values of n and k, and a pointer to the combination array.
The elements of the combination array are all of type size_t, and
are stored in increasing order. The gsl_combination structure
looks like this,
typedef struct
{
size_t n;
size_t k;
size_t *data;
} gsl_combination;
gsl_combination_calloc if you
want to create a combination which is initialized to the
lexicographically first combination. A null pointer is returned if
insufficient memory is available to create the combination.
The following function can be used to access the elements of a combination.
GSL_SUCCESS. If no further
combinations are available it returns GSL_FAILURE and leaves
c unmodified. Starting with the first combination and
repeatedly applying this function will iterate through all possible
combinations of a given order.
GSL_SUCCESS. If no previous combination is available it returns
GSL_FAILURE and leaves c unmodified.
The library provides functions for reading and writing combinations to a file as binary data or formatted text.
GSL_EFAILED if there was a problem writing to the file. Since the
data is written in the native binary format it may not be portable
between different architectures.
GSL_EFAILED if there was a problem reading
from the file. The data is assumed to have been written in the native
binary format on the same architecture.
Z represents size_t, so
"%Zu\n" is a suitable format. The function returns
GSL_EFAILED if there was a problem writing to the file.
GSL_EFAILED if there was a problem reading from the file.
The example program below prints all subsets of the set {0,1,2,3} ordered by size. Subsets of the same size are ordered lexicographically.
#include <stdio.h>
#include <gsl/gsl_combination.h>
int
main (void)
{
gsl_combination * c;
size_t i;
printf("All subsets of {0,1,2,3} by size:\n") ;
for(i = 0; i <= 4; i++)
{
c = gsl_combination_calloc (4, i);
do
{
printf("{");
gsl_combination_fprintf (stdout, c, " %u");
printf(" }\n");
}
while (gsl_combination_next(c) == GSL_SUCCESS);
gsl_combination_free(c);
}
return 0;
}
Here is the output from the program,
bash$ ./a.out
All subsets of {0,1,2,3} by size:
{ }
{ 0 }
{ 1 }
{ 2 }
{ 3 }
{ 0 1 }
{ 0 2 }
{ 0 3 }
{ 1 2 }
{ 1 3 }
{ 2 3 }
{ 0 1 2 }
{ 0 1 3 }
{ 0 2 3 }
{ 1 2 3 }
{ 0 1 2 3 }
All 16 subsets are generated, and the subsets of each size are sorted lexicographically.
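The step from one combination to the next in lexicographic order can be sketched in plain C. The function next_combination is a hypothetical illustration which assumes k <= n and an array stored in increasing order; it is not claimed to match the library's internal implementation.

```c
#include <stddef.h>

/* Advance c (k increasing values from 0 .. n-1, with k <= n) to the
   next combination in lexicographic order.  Returns 1 on success, or
   0 if c is already the last combination, in which case c is
   unchanged. */
int
next_combination (size_t *c, size_t n, size_t k)
{
  size_t i = k;
  size_t j;

  /* Skip trailing positions that already hold their largest
     possible value, which is n - k + index. */
  while (i > 0 && c[i - 1] == n - k + (i - 1))
    i--;
  if (i == 0)
    return 0;                   /* also covers k == 0 */

  /* Increase the rightmost position that can grow, then reset the
     positions after it to the smallest increasing run. */
  c[i - 1]++;
  for (j = i; j < k; j++)
    c[j] = c[j - 1] + 1;
  return 1;
}
```

For n = 4 and k = 2, starting from (0,1), repeated calls visit the six pairs listed in the output above; summing over k = 0 .. 4 gives the 16 subsets.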
This chapter describes functions for sorting data, both directly and indirectly (using an index). All the functions use the heapsort algorithm. Heapsort is an O(N log N) algorithm which operates in-place. It does not require any additional storage and provides consistent performance. The running time for its worst-case (ordered data) is not significantly longer than the average and best cases. Note that the heapsort algorithm does not preserve the relative ordering of equal elements; it is an unstable sort. However the resulting order of equal elements will be consistent across different platforms when using these functions.
The following function provides a simple alternative to the standard
library function qsort. It is intended for systems lacking
qsort, not as a replacement for it. The function qsort
should be used whenever possible, as it will be faster and can provide
stable ordering of equal elements. Documentation for qsort is
available in the GNU C Library Reference Manual.
The functions described in this section are defined in the header file `gsl_heapsort.h'.
This function sorts the count elements of the array array, each of size size, into ascending order using the comparison function compare. The type of the comparison function is defined by,
int (*gsl_comparison_fn_t) (const void * a,
const void * b)
A comparison function should return a negative integer if the first
argument is less than the second argument, 0 if the two arguments
are equal and a positive integer if the first argument is greater than
the second argument.
For example, the following function can be used to sort doubles into ascending numerical order.
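int
compare_doubles (const void * a,
const void * b)
{
/* The arguments must be const void * to match gsl_comparison_fn_t */
const double x = *(const double *) a;
const double y = *(const double *) b;
if (x > y)
return 1;
else if (x < y)
return -1;
else
return 0;
}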
The appropriate function call to perform the sort is,
gsl_heapsort (array, count, sizeof(double),
compare_doubles);
Note that unlike qsort the heapsort algorithm cannot be made into
a stable sort by pointer arithmetic. The trick of comparing pointers for
equal elements in the comparison function does not work for the heapsort
algorithm. The heapsort algorithm performs an internal rearrangement of
the data which destroys its initial ordering.
This function indirectly sorts the count elements of the array array, each of size size, into ascending order using the comparison function compare. The resulting permutation is stored in p, an array of length count. The elements of p give the index of the array element which would have been stored in that position if the array had been sorted in place. The first element of p gives the index of the least element in array, and the last element of p gives the index of the greatest element in array. The array itself is not changed.
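The meaning of an index permutation can be shown with a small plain-C sketch. For brevity it uses a simple insertion sort rather than heapsort, and works on doubles only; the name sort_index is hypothetical and the sketch is not the library's routine.

```c
#include <stddef.h>

/* Fill p with the indices of data (length n) in ascending order of
   value, leaving data itself untouched.  Uses insertion sort, an
   O(n^2) stand-in chosen for brevity, not heapsort. */
void
sort_index (size_t *p, const double *data, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    {
      size_t j = i;
      /* Shift indices of larger values one place to the right. */
      while (j > 0 && data[p[j - 1]] > data[i])
        {
          p[j] = p[j - 1];
          j--;
        }
      p[j] = i;
    }
}
```

For the data (3, 1, 2) the resulting index array is (1, 2, 0): its first entry is the position of the least element and its last entry is the position of the greatest.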
The following functions will sort the elements of an array or vector,
either directly or indirectly. They are defined for all real and integer
types using the normal suffix rules. For example, the float
versions of the array functions are gsl_sort_float and
gsl_sort_float_index. The corresponding vector functions are
gsl_sort_vector_float and gsl_sort_vector_float_index. The
prototypes are available in the header files `gsl_sort_float.h' and
`gsl_sort_vector_float.h'. The complete set of prototypes can be
included using the header files `gsl_sort.h' and
`gsl_sort_vector.h'.
There are no functions for sorting complex arrays or vectors, since the ordering of complex numbers is not uniquely defined. To sort a complex vector by magnitude compute a real vector containing the magnitudes of the complex elements, and sort this vector indirectly. The resulting index gives the appropriate ordering of the original complex vector.
The functions described in this section select the k smallest or largest elements of a data set of size N. The routines use an O(kN) direct insertion algorithm which is suited to subsets that are small compared with the total size of the dataset. For example, the routines are useful for selecting the 10 largest values from one million data points, but not for selecting the largest 100,000 values. If the subset is a significant part of the total dataset it may be faster to sort all the elements of the dataset directly with an O(N log N) algorithm and obtain the smallest or largest values that way.
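The direct insertion idea can be sketched in plain C as follows. The function smallest_k is a hypothetical illustration of the technique and is not the library's own code: it keeps a sorted buffer of at most k values and inserts each new value only if it is smaller than the largest value currently kept.

```c
#include <stddef.h>

/* Store the k smallest values of src (length n) in dest, in
   ascending order.  If n < k only the first n entries of dest are
   written.  Each of the n values costs at most k shifts, giving the
   O(kN) behaviour. */
void
smallest_k (double *dest, size_t k, const double *src, size_t n)
{
  size_t filled = 0;            /* number of values held in dest */
  size_t i;

  for (i = 0; i < n; i++)
    {
      double x = src[i];
      size_t j;

      /* Buffer full and x not smaller than its largest entry. */
      if (filled == k && (k == 0 || x >= dest[k - 1]))
        continue;

      /* Open a slot at the end, or overwrite the largest entry. */
      j = (filled < k) ? filled++ : k - 1;

      /* Shift larger entries right and drop x into place. */
      while (j > 0 && dest[j - 1] > x)
        {
          dest[j] = dest[j - 1];
          j--;
        }
      dest[j] = x;
    }
}
```

Most values fail the first comparison and are skipped at once when k is much smaller than N, which is why the approach suits small subsets.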
The following functions find the indices of the k smallest or largest elements of a dataset,
The rank of an element is its order in the sorted data. The rank is the inverse of the index permutation, p. It can be computed using the following algorithm,
for (i = 0; i < p->size; i++)
{
size_t pi = p->data[i];
rank->data[pi] = i;
}
The same result can be obtained directly using the function
gsl_permutation_inverse(rank,p).
The following function will print the rank of each element of the vector v,
void
print_rank (gsl_vector * v)
{
size_t i;
size_t n = v->size;
gsl_permutation * perm = gsl_permutation_alloc(n);
gsl_permutation * rank = gsl_permutation_alloc(n);
gsl_sort_vector_index (perm, v);
gsl_permutation_inverse (rank, perm);
for (i = 0; i < n; i++)
{
double vi = gsl_vector_get(v, i);
printf("element = %zu, value = %g, rank = %zu\n",
i, vi, rank->data[i]);
}
gsl_permutation_free (perm);
gsl_permutation_free (rank);
}
The following example shows how to use the permutation p to print the elements of the vector v in ascending order,
gsl_sort_vector_index (p, v);
for (i = 0; i < v->size; i++)
{
double vpi = gsl_vector_get(v, p->data[i]);
printf("order = %d, value = %g\n", i, vpi);
}
The next example uses the function gsl_sort_smallest to select
the 5 smallest numbers from 100000 uniform random variates stored in an
array,
#include <stdio.h>
#include <stdlib.h>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_sort_double.h>
int
main (void)
{
const gsl_rng_type * T;
gsl_rng * r;
int i, k = 5, N = 100000;
double * x = malloc (N * sizeof(double));
double * small = malloc (k * sizeof(double));
gsl_rng_env_setup();
T = gsl_rng_default;
r = gsl_rng_alloc (T);
for (i = 0; i < N; i++)
{
x[i] = gsl_rng_uniform(r);
}
gsl_sort_smallest (small, k, x, 1, N);
printf("%d smallest values from %d\n", k, N);
for (i = 0; i < k; i++)
{
printf ("%d: %.18f\n", i, small[i]);
}
return 0;
}
The output lists the 5 smallest values, in ascending order,
$ ./a.out
5 smallest values from 100000
0: 0.000005466630682349
1: 0.000012384494766593
2: 0.000017581274732947
3: 0.000025131041184068
4: 0.000031369971111417
The subject of sorting is covered extensively in Knuth's Sorting and Searching,
The Heapsort algorithm is described in the following book,
The Basic Linear Algebra Subprograms (BLAS) define a set of fundamental operations on vectors and matrices which can be used to create optimized higher-level linear algebra functionality.
The library provides a low-level layer which corresponds directly to the
C-language BLAS standard, referred to here as "CBLAS", and a
higher-level interface for operations on GSL vectors and matrices.
Users who are interested in simple operations on GSL vector and matrix
objects should use the high-level layer, which is declared in the file
gsl_blas.h. This should satisfy the needs of most users. Note
that GSL matrices are implemented using dense-storage so the interface
only includes the corresponding dense-storage BLAS functions. The full
BLAS functionality for band-format and packed-format matrices is
available through the low-level CBLAS interface.
The interface for the gsl_cblas layer is specified in the file
gsl_cblas.h. This interface corresponds to the BLAS Technical
Forum's draft standard for the C interface to legacy BLAS
implementations. Users who have access to other conforming CBLAS
implementations can use these in place of the version provided by the
library. Note that users who have only a Fortran BLAS library can
use a CBLAS conformant wrapper to convert it into a CBLAS
library. A reference CBLAS wrapper for legacy Fortran
implementations exists as part of the draft CBLAS standard and can
be obtained from Netlib. The complete set of CBLAS functions is
listed in an appendix (see section GSL CBLAS Library).
There are three levels of BLAS operations,
Each routine has a name which specifies the operation, the type of matrices involved and their precisions. Some of the most common operations and their names are given below,
The types of matrices are,
Each operation is defined for four precisions,
Thus, for example, the name SGEMM stands for "single-precision general matrix-matrix multiply" and ZGEMM stands for "double-precision complex matrix-matrix multiply".
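To make the "general matrix-matrix multiply" operation concrete, the following plain-C sketch computes C = alpha A B + beta C for double-precision real matrices in row-major storage, without transposition. The function naive_dgemm and its argument order are hypothetical; it is an unoptimized reference loop, not the CBLAS routine or its calling convention.

```c
#include <stddef.h>

/* C = alpha*A*B + beta*C, where A is m-by-k, B is k-by-n and C is
   m-by-n, all stored in row-major order.  lda, ldb and ldc are the
   physical row lengths of A, B and C. */
void
naive_dgemm (size_t m, size_t n, size_t k,
             double alpha, const double *A, size_t lda,
             const double *B, size_t ldb,
             double beta, double *C, size_t ldc)
{
  size_t i, j, l;
  for (i = 0; i < m; i++)
    for (j = 0; j < n; j++)
      {
        double sum = 0.0;
        for (l = 0; l < k; l++)
          sum += A[i * lda + l] * B[l * ldb + j];
        C[i * ldc + j] = alpha * sum + beta * C[i * ldc + j];
      }
}
```

An optimized BLAS performs the same arithmetic but reorders and blocks the loops for the memory hierarchy, which is the reason for routing such operations through the BLAS interface.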
GSL provides dense vector and matrix objects, based on the relevant
built-in types. The library provides an interface to the BLAS
operations which apply to these objects. The interface to this
functionality is given in the file gsl_blas.h.
The variables a and b are overwritten by the routine.
CblasNoTrans,
CblasTrans, CblasConjTrans.
CblasNoTrans, CblasTrans, CblasConjTrans. When
Uplo is CblasUpper then the upper triangle of A is
used, and when Uplo is CblasLower then the lower triangle
of A is used. If Diag is CblasNonUnit then the
diagonal of the matrix is used, but if Diag is CblasUnit
then the diagonal elements of the matrix A are taken as unity and
are not referenced.
CblasNoTrans, CblasTrans, CblasConjTrans. When
Uplo is CblasUpper then the upper triangle of A is
used, and when Uplo is CblasLower then the lower triangle
of A is used. If Diag is CblasNonUnit then the
diagonal of the matrix is used, but if Diag is CblasUnit
then the diagonal elements of the matrix A are taken as unity and
are not referenced.
CblasUpper then the upper triangle
and diagonal of A are used, and when Uplo is
CblasLower then the lower triangle and diagonal of A are
used.
CblasUpper then the upper triangle
and diagonal of A are used, and when Uplo is
CblasLower then the lower triangle and diagonal of A are
used. The imaginary elements of the diagonal are automatically assumed
to be zero and are not referenced.
CblasUpper then the upper triangle and diagonal of
A are used, and when Uplo is CblasLower then the
lower triangle and diagonal of A are used.
CblasUpper then the upper triangle and diagonal of
A are used, and when Uplo is CblasLower then the
lower triangle and diagonal of A are used. The imaginary elements
of the diagonal are automatically set to zero.
CblasUpper then the upper triangle
and diagonal of A are used, and when Uplo is
CblasLower then the lower triangle and diagonal of A are
used.
CblasUpper then the upper triangle
and diagonal of A are used, and when Uplo is
CblasLower then the lower triangle and diagonal of A are
used. The imaginary elements of the diagonal are automatically set to zero.
CblasNoTrans, CblasTrans,
CblasConjTrans and similarly for the parameter TransB.
CblasLeft and C =
\alpha B A + \beta C for Side is CblasRight, where the
matrix A is symmetric. When Uplo is CblasUpper then
the upper triangle and diagonal of A are used, and when Uplo
is CblasLower then the lower triangle and diagonal of A are
used.
CblasLeft and C =
\alpha B A + \beta C for Side is CblasRight, where the
matrix A is hermitian. When Uplo is CblasUpper then
the upper triangle and diagonal of A are used, and when Uplo
is CblasLower then the lower triangle and diagonal of A are
used. The imaginary elements of the diagonal are automatically set to
zero.
CblasLeft and B = \alpha B op(A) for
Side is CblasRight. The matrix A is triangular and
op(A) = A, A^T, A^H for TransA =
CblasNoTrans, CblasTrans, CblasConjTrans. When
Uplo is CblasUpper then the upper triangle of A is
used, and when Uplo is CblasLower then the lower triangle
of A is used. If Diag is CblasNonUnit then the
diagonal of A is used, but if Diag is CblasUnit then
the diagonal elements of the matrix A are taken as unity and are
not referenced.
CblasLeft and B = \alpha B op(inv(A)) for
Side is CblasRight. The matrix A is triangular and
op(A) = A, A^T, A^H for TransA =
CblasNoTrans, CblasTrans, CblasConjTrans. When
Uplo is CblasUpper then the upper triangle of A is
used, and when Uplo is CblasLower then the lower triangle
of A is used. If Diag is CblasNonUnit then the
diagonal of A is used, but if Diag is CblasUnit then
the diagonal elements of the matrix A are taken as unity and are
not referenced.
CblasNoTrans and C = \alpha A^T A + \beta C when
Trans is CblasTrans. Since the matrix C is symmetric
only its upper half or lower half need to be stored. When Uplo is
CblasUpper then the upper triangle and diagonal of C are
used, and when Uplo is CblasLower then the lower triangle
and diagonal of C are used.
CblasNoTrans and C = \alpha A^H A + \beta C when
Trans is CblasConjTrans. Since the matrix C is hermitian
only its upper half or lower half need to be stored. When Uplo is
CblasUpper then the upper triangle and diagonal of C are
used, and when Uplo is CblasLower then the lower triangle
and diagonal of C are used. The imaginary elements of the
diagonal are automatically set to zero.
CblasNoTrans and C = \alpha A^T B + \alpha B^T A + \beta C when
Trans is CblasTrans. Since the matrix C is symmetric
only its upper half or lower half need to be stored. When Uplo is
CblasUpper then the upper triangle and diagonal of C are
used, and when Uplo is CblasLower then the lower triangle
and diagonal of C are used.
CblasNoTrans and C = \alpha A^H B + \alpha^* B^H A + \beta C when
Trans is CblasConjTrans. Since the matrix C is hermitian
only its upper half or lower half need to be stored. When Uplo is
CblasUpper then the upper triangle and diagonal of C are
used, and when Uplo is CblasLower then the lower triangle
and diagonal of C are used. The imaginary elements of the
diagonal are automatically set to zero.
The following program computes the product of two matrices using the Level-3 BLAS function DGEMM,
The matrices are stored in row major order, according to the C convention for arrays.
#include <stdio.h>
#include <gsl/gsl_blas.h>
int
main (void)
{
double a[] = { 0.11, 0.12, 0.13,
0.21, 0.22, 0.23 };
double b[] = { 1011, 1012,
1021, 1022,
1031, 1032 };
double c[] = { 0.00, 0.00,
0.00, 0.00 };
gsl_matrix_view A = gsl_matrix_view_array(a, 2, 3);
gsl_matrix_view B = gsl_matrix_view_array(b, 3, 2);
gsl_matrix_view C = gsl_matrix_view_array(c, 2, 2);
/* Compute C = A B */
gsl_blas_dgemm (CblasNoTrans, CblasNoTrans,
1.0, &A.matrix, &B.matrix,
0.0, &C.matrix);
printf("[ %g, %g\n", c[0], c[1]);
printf(" %g, %g ]\n", c[2], c[3]);
return 0;
}
Here is the output from the program,
$ ./a.out
[ 367.76, 368.12
  674.06, 674.72 ]
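To make the row-major convention concrete, the following is a minimal sketch in plain C of the operation that DGEMM performs when neither matrix is transposed, C = \alpha A B + \beta C. The function name naive_gemm and its argument order are illustrative assumptions, not part of GSL or CBLAS, and the triple loop is a reference for checking results rather than a substitute for an optimized BLAS.

```c
#include <stddef.h>

/* Hypothetical reference routine (not a GSL or CBLAS function).
   Computes C = alpha * A * B + beta * C for dense row-major matrices,
   where A is m-by-k, B is k-by-n and C is m-by-n. */
void
naive_gemm (size_t m, size_t n, size_t k,
            double alpha, const double *a, const double *b,
            double beta, double *c)
{
  size_t i, j, p;

  for (i = 0; i < m; i++)
    {
      for (j = 0; j < n; j++)
        {
          double sum = 0.0;

          /* In row-major order element (i,p) of an m-by-k array
             is stored at offset i*k + p. */
          for (p = 0; p < k; p++)
            {
              sum += a[i * k + p] * b[p * n + j];
            }

          c[i * n + j] = alpha * sum + beta * c[i * n + j];
        }
    }
}
```

Calling naive_gemm (2, 2, 3, 1.0, a, b, 0.0, c) with the arrays a, b and c from the program above should give the same four values as the library call.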
Information on the BLAS standards, including both the legacy and draft interface standards, is available online from the BLAS Homepage and BLAS Technical Forum web-site.
The following papers contain the specifications for Level 1, Level 2 and Level 3 BLAS.
Postscript versions of the latter two papers are available from http://www.netlib.org/blas/. A CBLAS wrapper for Fortran BLAS libraries is available from the same location.
This chapter describes functions for solving linear systems. The
library provides simple linear algebra operations which operate directly
on the gsl_vector and gsl_matrix objects. These are
intended for use with "small" systems where simple algorithms are
acceptable.
Anyone interested in large systems will want to use the sophisticated routines found in LAPACK. The Fortran version of LAPACK is recommended as the standard package for linear algebra. It supports blocked algorithms, specialized data representations and other optimizations.
The functions described in this chapter are declared in the header file `gsl_linalg.h'.
A general square matrix A has an LU decomposition into upper and lower triangular matrices,
P A = L U
where P is a permutation matrix, L is a unit lower triangular matrix and U is an upper triangular matrix. For square matrices this decomposition can be used to convert the linear system A x = b into a pair of triangular systems (L y = P b, U x = y), which can be solved by forward and back-substitution.
The permutation matrix P is encoded in the permutation p. The j-th column of the matrix P is given by the k-th column of the identity matrix, where k = p_j the j-th element of the permutation vector. The sign of the permutation is given by signum. It has the value (-1)^n, where n is the number of interchanges in the permutation.
The algorithm used in the decomposition is Gaussian Elimination with partial pivoting (Golub & Van Loan, Matrix Computations, Algorithm 3.4.1).
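The pair of triangular solves can be sketched in plain C as follows. This is an illustration only: the function lu_substitute is hypothetical, it takes L and U as separate row-major arrays rather than in the library's storage format, and it assumes the caller has already applied the permutation to the right-hand side (the argument pb holds P b).

```c
#include <stddef.h>

/* Hypothetical helper, not part of GSL.  Solves L U x = pb where
   l is unit lower triangular, u is upper triangular (both n-by-n,
   row-major) and pb is the permuted right-hand side P b. */
void
lu_substitute (size_t n, const double *l, const double *u,
               const double *pb, double *x)
{
  size_t i, j;

  /* Forward substitution: solve L y = P b, storing y in x.
     L has a unit diagonal, so no division is needed. */
  for (i = 0; i < n; i++)
    {
      double sum = pb[i];

      for (j = 0; j < i; j++)
        {
          sum -= l[i * n + j] * x[j];
        }

      x[i] = sum;
    }

  /* Back substitution: solve U x = y in place, last row first. */
  for (i = n; i-- > 0;)
    {
      double sum = x[i];

      for (j = i + 1; j < n; j++)
        {
          sum -= u[i * n + j] * x[j];
        }

      x[i] = sum / u[i * n + i];
    }
}
```

Each substitution costs O(n^2) operations, so once the factors are known many right-hand sides can be solved cheaply.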
gsl_linalg_LU_decomp or gsl_linalg_complex_LU_decomp.
A general rectangular M-by-N matrix A has a QR decomposition into the product of an orthogonal M-by-M square matrix Q (where Q^T Q = I) and an M-by-N right-triangular matrix R,
A = Q R
This decomposition can be used to convert the linear system A x = b into the triangular system R x = Q^T b, which can be solved by back-substitution. Another use of the QR decomposition is to compute an orthonormal basis for a set of vectors. The first N columns of Q form an orthonormal basis for the range of A, ran(A), when A has full column rank.
The algorithm used to perform the decomposition is Householder QR (Golub & Van Loan, Matrix Computations, Algorithm 5.2.1).
gsl_linalg_QR_decomp.
gsl_linalg_QR_decomp. On input x should contain the
right-hand side b, which is replaced by the solution on output.
gsl_linalg_QR_decomp. The solution is returned in x. The
residual is computed as a by-product and stored in residual.
gsl_linalg_QR_QTvec.
gsl_linalg_QR_QTvec.
The QR decomposition can be extended to the rank deficient case by introducing a column permutation P,
A P = Q R
The first r columns of this Q form an orthonormal basis for the range of A for a matrix with column rank r. This decomposition can also be used to convert the linear system A x = b into the triangular system R y = Q^T b, x = P y, which can be solved by back-substitution and permutation. We denote the QR decomposition with column pivoting by QRP^T since A = Q R P^T.
The algorithm used to perform the decomposition is Householder QR with column pivoting (Golub & Van Loan, Matrix Computations, Algorithm 5.4.1).
gsl_linalg_QRPT_decomp.
A general rectangular M-by-N matrix A has a singular value decomposition (SVD) into the product of an M-by-N orthogonal matrix U, an N-by-N diagonal matrix of singular values S and the transpose of an N-by-N orthogonal square matrix V,
A = U S V^T
The singular values \sigma_i = S_{ii} are all non-negative and are generally chosen to form a non-increasing sequence \sigma_1 >= \sigma_2 >= ... >= \sigma_N >= 0.
The singular value decomposition of a matrix has many practical uses. The condition number of the matrix is given by the ratio of the largest singular value to the smallest singular value. The presence of a zero singular value indicates that the matrix is singular. The number of non-zero singular values indicates the rank of the matrix. In practice singular value decomposition of a rank-deficient matrix will not produce exact zeroes for singular values, due to finite numerical precision. Small singular values should be edited by choosing a suitable tolerance.
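As a sketch of how a tolerance might be applied in practice, the plain C helpers below take an array of singular values sorted in non-increasing order and report the numerical rank and the condition number. The names numerical_rank and condition_number, and the choice of a tolerance relative to the largest singular value, are assumptions made for illustration and are not part of the library.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper, not part of GSL.  Counts the singular values
   that exceed tol times the largest one, s[0]; the rest are treated
   as zero.  The array s must be sorted in non-increasing order. */
size_t
numerical_rank (const double *s, size_t n, double tol)
{
  size_t i, rank = 0;

  for (i = 0; i < n; i++)
    {
      if (s[i] > tol * s[0])
        {
          rank++;
        }
    }

  return rank;
}

/* Hypothetical helper, not part of GSL.  Returns the ratio of the
   largest to the smallest singular value, or HUGE_VAL when the
   smallest singular value is exactly zero. */
double
condition_number (const double *s, size_t n)
{
  if (n == 0 || s[n - 1] == 0.0)
    {
      return HUGE_VAL;
    }

  return s[0] / s[n - 1];
}
```

A suitable tolerance depends on the problem; a small multiple of the machine precision is a common starting point for matrices whose entries are known exactly.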
This routine uses the Golub-Reinsch SVD algorithm.
gsl_linalg_SV_decomp.
Only non-zero singular values are used in computing the solution. The parts of the solution corresponding to singular values of zero are ignored. Other singular values can be edited out by setting them to zero before calling this function.
In the over-determined case where A has more rows than columns the system is solved in the least squares sense, returning the solution x which minimizes ||A x - b||_2.
A symmetric, positive definite square matrix A has a Cholesky decomposition into a product of a lower triangular matrix L and its transpose L^T,
A = L L^T
This is sometimes referred to as taking the square-root of a matrix. The Cholesky decomposition can only be carried out when all the eigenvalues of the matrix are positive. This decomposition can be used to convert the linear system A x = b into a pair of triangular systems (L y = b, L^T x = y), which can be solved by forward and back-substitution.
GSL_EDOM.
gsl_linalg_cholesky_decomp.
gsl_linalg_cholesky_decomp. On input x should contain
the right-hand side b, which is replaced by the solution on
output.
A symmetric matrix A can be factorized by similarity transformations into the form,
A = Q T Q^T
where Q is an orthogonal matrix and T is a symmetric tridiagonal matrix.
gsl_linalg_symmtd_decomp into
the orthogonal matrix Q, the vector of diagonal elements diag
and the vector of subdiagonal elements subdiag.
gsl_linalg_symmtd_decomp into the vectors diag and subdiag.
A hermitian matrix A can be factorized by similarity transformations into the form,
where U is a unitary matrix and T is a real symmetric tridiagonal matrix.
gsl_linalg_hermtd_decomp into the
unitary matrix U, the real vector of diagonal elements diag and
the real vector of subdiagonal elements subdiag.
gsl_linalg_hermtd_decomp into the real vectors diag and
subdiag.
A general matrix A can be factorized by similarity transformations into the form,
A = U B V^T
where U and V are orthogonal matrices and B is an N-by-N bidiagonal matrix with non-zero entries only on the diagonal and superdiagonal. The size of U is M-by-N and the size of V is N-by-N.
gsl_linalg_bidiag_decomp, (A, tau_U, tau_V)
into the separate orthogonal matrices U, V and the diagonal
vector diag and superdiagonal superdiag.
gsl_linalg_bidiag_decomp, (A, tau_U, tau_V)
into the separate orthogonal matrices U, V and the diagonal
vector diag and superdiagonal superdiag. The matrix U
is stored in-place in A.
gsl_linalg_bidiag_decomp, into
the diagonal vector diag and superdiagonal vector superdiag.
The following program solves the linear system A x = b. The system to be solved is,
and the solution is found using LU decomposition of the matrix A.
#include <stdio.h>
#include <gsl/gsl_linalg.h>
int
main (void)
{
double a_data[] = { 0.18, 0.60, 0.57, 0.96,
0.41, 0.24, 0.99, 0.58,
0.14, 0.30, 0.97, 0.66,
0.51, 0.13, 0.19, 0.85 };
double b_data[] = { 1.0, 2.0, 3.0, 4.0 };
gsl_matrix_view m
= gsl_matrix_view_array(a_data, 4, 4);
gsl_vector_view b
= gsl_vector_view_array(b_data, 4);
gsl_vector *x = gsl_vector_alloc (4);
int s;
gsl_permutation * p = gsl_permutation_alloc (4);
gsl_linalg_LU_decomp (&m.matrix, p, &s);
gsl_linalg_LU_solve (&m.matrix, p, &b.vector, x);
printf ("x = \n");
gsl_vector_fprintf(stdout, x, "%g");
gsl_permutation_free (p);
gsl_vector_free (x);
return 0;
}
Here is the output from the program,
x =
-4.05205
-12.6056
1.66091
8.69377
This can be verified by multiplying the solution x by the original matrix A using GNU OCTAVE,
octave> A = [ 0.18, 0.60, 0.57, 0.96;
0.41, 0.24, 0.99, 0.58;
0.14, 0.30, 0.97, 0.66;
0.51, 0.13, 0.19, 0.85 ];
octave> x = [ -4.05205; -12.6056; 1.66091; 8.69377];
octave> A * x
ans =
1.0000
2.0000
3.0000
4.0000
This reproduces the original right-hand side vector, b, in accordance with the equation A x = b.
Further information on the algorithms described in this section can be found in the following book,
The LAPACK library is described in,
The LAPACK source code can be found at the website above, along with an online copy of the users guide.
The Modified Golub-Reinsch algorithm is described in the following paper,
The Jacobi algorithm for singular value decomposition is described in the following papers,
lawns or
lawnspdf directories.
This chapter describes functions for computing eigenvalues and eigenvectors of matrices. There are routines for real symmetric and complex hermitian matrices, and eigenvalues can be computed with or without eigenvectors. The algorithms used are symmetric bidiagonalization followed by QR reduction.
These routines are intended for "small" systems where simple algorithms are acceptable. Anyone interested in finding eigenvalues and eigenvectors of large matrices will want to use the sophisticated routines found in LAPACK. The Fortran version of LAPACK is recommended as the standard package for linear algebra.
The functions described in this chapter are declared in the header file `gsl_eigen.h'.
GSL_EIGEN_SORT_VAL_ASC
GSL_EIGEN_SORT_VAL_DESC
GSL_EIGEN_SORT_ABS_ASC
GSL_EIGEN_SORT_ABS_DESC
The following program computes the eigenvalues and eigenvectors of the 4-th order Hilbert matrix, H(i,j) = 1/(i + j + 1).
#include <stdio.h>
#include <gsl/gsl_math.h>
#include <gsl/gsl_eigen.h>
int
main (void)
{
double data[] = { 1.0 , 1/2.0, 1/3.0, 1/4.0,
1/2.0, 1/3.0, 1/4.0, 1/5.0,
1/3.0, 1/4.0, 1/5.0, 1/6.0,
1/4.0, 1/5.0, 1/6.0, 1/7.0 };
gsl_matrix_view m
= gsl_matrix_view_array(data, 4, 4);
gsl_vector *eval = gsl_vector_alloc (4);
gsl_matrix *evec = gsl_matrix_alloc (4, 4);
gsl_eigen_symmv_workspace * w =
gsl_eigen_symmv_alloc (4);
gsl_eigen_symmv (&m.matrix, eval, evec, w);
gsl_eigen_symmv_free(w);
gsl_eigen_symmv_sort (eval, evec,
GSL_EIGEN_SORT_ABS_ASC);
{
int i;
for (i = 0; i < 4; i++)
{
double eval_i
= gsl_vector_get(eval, i);
gsl_vector_view evec_i
= gsl_matrix_column(evec, i);
printf("eigenvalue = %g\n", eval_i);
printf("eigenvector = \n");
gsl_vector_fprintf(stdout,
&evec_i.vector, "%g");
}
}
gsl_vector_free (eval);
gsl_matrix_free (evec);
return 0;
}
Here is the beginning of the output from the program,
$ ./a.out
eigenvalue = 9.67023e-05
eigenvector =
-0.0291933
0.328712
-0.791411
0.514553
...
This can be compared with the corresponding output from GNU OCTAVE,
octave> [v,d] = eig(hilb(4));
octave> diag(d)
ans =
   9.6702e-05
   6.7383e-03
   1.6914e-01
   1.5002e+00
octave> v
v =
   0.029193   0.179186  -0.582076   0.792608
  -0.328712  -0.741918   0.370502   0.451923
   0.791411   0.100228   0.509579   0.322416
  -0.514553   0.638283   0.514048   0.252161
Note that the eigenvectors can differ by a change of sign, since the sign of an eigenvector is arbitrary.
Further information on the algorithms described in this section can be found in the following book,
The LAPACK library is described in,
The LAPACK source code can be found at the website above along with an online copy of the users guide.
This chapter describes functions for performing Fast Fourier Transforms (FFTs). The library includes radix-2 routines (for lengths which are a power of two) and mixed-radix routines (which work for any length). For efficiency there are separate versions of the routines for real data and for complex data. The mixed-radix routines are a reimplementation of the FFTPACK library by Paul Swarztrauber. Fortran code for FFTPACK is available on Netlib (FFTPACK also includes some routines for sine and cosine transforms but these are currently not available in GSL). For details and derivations of the underlying algorithms consult the document GSL FFT Algorithms (see section References and Further Reading).
Fast Fourier Transforms are efficient algorithms for calculating the discrete fourier transform (DFT),
x_j = \sum_{k=0}^{N-1} z_k \exp(-2 \pi i j k / N)
The DFT usually arises as an approximation to the continuous fourier transform when functions are sampled at discrete intervals in space or time. The naive evaluation of the discrete fourier transform is a matrix-vector multiplication W\vec{z}. A general matrix-vector multiplication takes O(N^2) operations for N data-points. Fast fourier transform algorithms use a divide-and-conquer strategy to factorize the matrix W into smaller sub-matrices, corresponding to the integer factors of the length N. If N can be factorized into a product of integers f_1 f_2 ... f_n then the DFT can be computed in O(N \sum f_i) operations. For a radix-2 FFT this gives an operation count of O(N \log_2 N).
All the FFT functions offer three types of transform: forwards, inverse and backwards, based on the same mathematical definitions. The definition of the forward fourier transform, x = FFT(z), is,
x_j = \sum_{k=0}^{N-1} z_k \exp(-2 \pi i j k / N)
and the definition of the inverse fourier transform, x = IFFT(z), is,
z_j = (1/N) \sum_{k=0}^{N-1} x_k \exp(2 \pi i j k / N)
The factor of 1/N makes this a true inverse. For example, a call
to gsl_fft_complex_forward followed by a call to
gsl_fft_complex_inverse should return the original data (within
numerical errors).
In general there are two possible choices for the sign of the exponential in the transform/inverse-transform pair. GSL follows the same convention as FFTPACK, using a negative exponential for the forward transform. The advantage of this convention is that the inverse transform recreates the original function with simple fourier synthesis. Numerical Recipes uses the opposite convention, a positive exponential in the forward transform.
The backwards FFT is simply our terminology for an unscaled version of the inverse FFT,
z^{backwards}_j = \sum_{k=0}^{N-1} x_k \exp(2 \pi i j k / N)
When the overall scale of the result is unimportant it is often convenient to use the backwards FFT instead of the inverse to save unnecessary divisions.
The inputs and outputs for the complex FFT routines are packed arrays of floating point numbers. In a packed array the real and imaginary parts of each complex number are placed in alternate neighboring elements. For example, the following definition of a packed array of length 6,
double data[6];
can be used to hold an array of three complex numbers, z[3], in
the following way,
data[0] = Re(z[0])
data[1] = Im(z[0])
data[2] = Re(z[1])
data[3] = Im(z[1])
data[4] = Re(z[2])
data[5] = Im(z[2])
A stride parameter allows the user to perform transforms on the
elements z[stride*i] instead of z[i]. A stride greater
than 1 can be used to take an in-place FFT of the column of a matrix. A
stride of 1 accesses the array without any additional spacing between
elements.
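The effect of the stride on a packed array can be sketched as follows. The helper gather_strided is hypothetical and simply copies the elements that a strided transform would visit into a contiguous packed array; it assumes that element z[stride*i] occupies the two doubles starting at offset 2*stride*i.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GSL.  Copies the n complex
   numbers z[stride*i], i = 0 .. n-1, from the packed array data
   into the contiguous packed array out (2*n doubles). */
void
gather_strided (const double *data, size_t stride, size_t n,
                double *out)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      out[2 * i] = data[2 * stride * i];         /* Re(z[stride*i]) */
      out[2 * i + 1] = data[2 * stride * i + 1]; /* Im(z[stride*i]) */
    }
}
```

For a complex matrix stored in row-major order with cols columns, column c corresponds to a base pointer of m + 2*c and a stride of cols, so gather_strided (m + 2*c, cols, rows, out) extracts that column.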
The array indices have the same ordering as those in the definition of the DFT -- i.e. there are no index transformations or permutations of the data.
For physical applications it is important to remember that the index appearing in the DFT does not correspond directly to a physical frequency. If the time-step of the DFT is \Delta then the frequency-domain includes both positive and negative frequencies, ranging from -1/(2\Delta) through 0 to +1/(2\Delta). The positive frequencies are stored from the beginning of the array up to the middle, and the negative frequencies are stored backwards from the end of the array.
Here is a table which shows the layout of the array data, and the correspondence between the time-domain data z, and the frequency-domain data x.
index    z               x = FFT(z)

0        z(t = 0)        x(f = 0)
1        z(t = 1)        x(f = 1/(N Delta))
2        z(t = 2)        x(f = 2/(N Delta))
.        ........        ..................
N/2      z(t = N/2)      x(f = +1/(2 Delta),
                               -1/(2 Delta))
.        ........        ..................
N-3      z(t = N-3)      x(f = -3/(N Delta))
N-2      z(t = N-2)      x(f = -2/(N Delta))
N-1      z(t = N-1)      x(f = -1/(N Delta))
When N is even the location N/2 contains the most positive and negative frequencies (+1/(2 \Delta), -1/(2 \Delta)) which are equivalent. If N is odd then the general structure of the table above still applies, but N/2 does not appear.
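The mapping from array index to physical frequency described above can be written as a small plain C function. The name dft_frequency is hypothetical; for even N the sketch reports the index N/2 as the positive frequency +1/(2 \Delta), which is an arbitrary choice between the two equivalent values.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GSL.  Returns the frequency
   associated with index k of an n-point DFT whose input was
   sampled with time-step delta.  Indices up to n/2 map to
   non-negative frequencies, the remainder to negative ones. */
double
dft_frequency (size_t k, size_t n, double delta)
{
  if (k <= n / 2)
    {
      return (double) k / ((double) n * delta);
    }
  else
    {
      return -(double) (n - k) / ((double) n * delta);
    }
}
```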
The radix-2 algorithms described in this section are simple and compact, although not necessarily the most efficient. They use the Cooley-Tukey algorithm to compute in-place complex FFTs for lengths which are a power of 2 -- no additional storage is required. The corresponding self-sorting mixed-radix routines offer better performance at the expense of requiring additional working space.
All these functions are declared in the header file `gsl_fft_complex.h'.
These functions compute forward, backward and inverse FFTs of length n with stride stride, on the packed complex array data using an in-place radix-2 decimation-in-time algorithm. The length of the transform n is restricted to powers of two.
The functions return a value of GSL_SUCCESS if no errors were
detected, or GSL_EDOM if the length of the data n is not a
power of two.
These are decimation-in-frequency versions of the radix-2 FFT functions.
Here is an example program which computes the FFT of a short pulse in a sample of length 128. To make the resulting fourier transform real the pulse is defined for equal positive and negative times (-10 ... 10), where the negative times wrap around the end of the array.
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_fft_complex.h>
#define REAL(z,i) ((z)[2*(i)])
#define IMAG(z,i) ((z)[2*(i)+1])
int
main (void)
{
int i;
double data[2*128];
for (i = 0; i < 128; i++)
{
REAL(data,i) = 0.0;
IMAG(data,i) = 0.0;
}
REAL(data,0) = 1.0;
for (i = 1; i <= 10; i++)
{
REAL(data,i) = REAL(data,128-i) = 1.0;
}
for (i = 0; i < 128; i++)
{
printf ("%d %e %e\n", i,
REAL(data,i), IMAG(data,i));
}
printf ("\n");
gsl_fft_complex_radix2_forward (data, 1, 128);
for (i = 0; i < 128; i++)
{
printf ("%d %e %e\n", i,
REAL(data,i)/sqrt(128),
IMAG(data,i)/sqrt(128));
}
return 0;
}
Note that we have assumed that the program is using the default error
handler (which calls abort for any errors). If you are not using
a safe error handler you would need to check the return status of
gsl_fft_complex_radix2_forward.
The transformed data is rescaled by 1/\sqrt N so that it fits on the same plot as the input. Only the real part is shown; by the choice of the input data the imaginary part is zero. Allowing for the wrap-around of negative times at t=128, and working in units of k/N, the DFT approximates the continuum fourier transform, giving a modulated \sin function.
A pulse and its discrete fourier transform, output from the example program.
This section describes mixed-radix FFT algorithms for complex data. The mixed-radix functions work for FFTs of any length. They are a reimplementation of the Fortran FFTPACK library by Paul Swarztrauber. The theory is explained in the review article Self-sorting Mixed-radix FFTs by Clive Temperton. The routines here use the same indexing scheme and basic algorithms as FFTPACK.
The mixed-radix algorithm is based on sub-transform modules -- highly optimized small length FFTs which are combined to create larger FFTs. There are efficient modules for factors of 2, 3, 4, 5, 6 and 7. The modules for the composite factors of 4 and 6 are faster than combining the modules for 2*2 and 2*3.
For factors which are not implemented as modules there is a fall-back to a general length-n module which uses Singleton's method for efficiently computing a DFT. This module is O(n^2), and slower than a dedicated module would be but works for any length n. Of course, lengths which use the general length-n module will still be factorized as much as possible. For example, a length of 143 will be factorized into 11*13. Large prime factors are the worst case scenario, e.g. as found in n=2*3*99991, and should be avoided because their O(n^2) scaling will dominate the run-time (consult the document GSL FFT Algorithms included in the GSL distribution if you encounter this problem).
The mixed-radix initialization function gsl_fft_complex_wavetable_alloc
returns the list of factors chosen by the library for a given length
N. It can be used to check how well the length has been
factorized, and estimate the run-time. To a first approximation the
run-time scales as N \sum f_i, where the f_i are the
factors of N. For programs under user control you may wish to
issue a warning that the transform will be slow when the length is
poorly factorized. If you frequently encounter data lengths which
cannot be factorized using the existing small-prime modules consult
GSL FFT Algorithms for details on adding support for other
factors.
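A rough version of the N \sum f_i estimate can be computed before any transform is attempted. The plain C sketch below uses ordinary trial division into prime factors; it is a simplification, since the library also has dedicated modules for the composite factors 4 and 6, so the factors it reports may differ. The name fft_cost_estimate is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not part of GSL.  Returns n times the sum
   of the prime factors of n (counted with multiplicity), as a
   first approximation to the operation count of a mixed-radix
   transform of length n. */
double
fft_cost_estimate (size_t n)
{
  size_t m = n, f = 2, sum = 0;

  while (m > 1)
    {
      if (m % f == 0)
        {
          /* f divides what is left: record it and continue */
          sum += f;
          m /= f;
        }
      else if (f * f > m)
        {
          /* no divisor up to sqrt(m): the remainder is prime */
          sum += m;
          m = 1;
        }
      else
        {
          f++;
        }
    }

  return (double) n * (double) sum;
}
```

Comparing the estimate for a candidate length with the estimate for nearby lengths gives a quick indication of whether padding or truncating the data would be worthwhile.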
All these functions are declared in the header file `gsl_fft_complex.h'.
gsl_fft_complex_wavetable if no errors were detected, and a null
pointer in the case of error. The length n is factorized into a
product of subtransforms, and the factors and their trigonometric
coefficients are stored in the wavetable. The trigonometric coefficients
are computed using direct calls to sin and cos, for
accuracy. Recursion relations could be used to compute the lookup table
faster, but if an application performs many FFTs of the same length then
this computation is a one-off overhead which does not affect the final
throughput.
The wavetable structure can be used repeatedly for any transform of the same length. The table is not modified by calls to any of the other FFT functions. The same wavetable can be used for both forward and backward (or inverse) transforms of a given length.
gsl_fft_complex_wavetable structure
which contains internal parameters for the FFT. It is not necessary to
set any of the components directly but it can sometimes be useful to
examine them. For example, the chosen factorization of the FFT length
is given and can be used to provide an estimate of the run-time or
numerical error.
The wavetable structure is declared in the header file `gsl_fft_complex.h'.
size_t n
size_t nf
n was decomposed into.
size_t factor[64]
nf elements are
used.
gsl_complex * trig
n complex elements.
gsl_complex * twiddle[64]
trig, giving the twiddle
factors for each pass.
The mixed radix algorithms require an additional working space to hold the intermediate steps of the transform.
The following functions compute the transform,
These functions compute forward, backward and inverse FFTs of length n with stride stride, on the packed complex array data, using a mixed radix decimation-in-frequency algorithm. There is no restriction on the length n. Efficient modules are provided for subtransforms of length 2, 3, 4, 5, 6 and 7. Any remaining factors are computed with a slow, O(n^2), general-n module. The caller must supply a wavetable containing the trigonometric lookup tables and a workspace work.
The functions return a value of 0 if no errors were detected. The
following gsl_errno conditions are defined for these functions:
GSL_EDOM
GSL_EINVAL
Here is an example program which computes the FFT of a short pulse in a sample of length 630 (=2*3*3*5*7) using the mixed-radix algorithm.
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_fft_complex.h>
#define REAL(z,i) ((z)[2*(i)])
#define IMAG(z,i) ((z)[2*(i)+1])
int
main (void)
{
int i;
const int n = 630;
double data[2*n];
gsl_fft_complex_wavetable * wavetable;
gsl_fft_complex_workspace * workspace;
for (i = 0; i < n; i++)
{
REAL(data,i) = 0.0;
IMAG(data,i) = 0.0;
}
REAL(data,0) = 1.0;
for (i = 1; i <= 10; i++)
{
REAL(data,i) = REAL(data,n-i) = 1.0;
}
for (i = 0; i < n; i++)
{
printf ("%d: %e %e\n", i, REAL(data,i),
IMAG(data,i));
}
printf ("\n");
wavetable = gsl_fft_complex_wavetable_alloc (n);
workspace = gsl_fft_complex_workspace_alloc (n);
for (i = 0; i < (int) wavetable->nf; i++)
{
printf("# factor %d: %d\n", i,
(int) wavetable->factor[i]);
}
gsl_fft_complex_forward (data, 1, n,
wavetable, workspace);
for (i = 0; i < n; i++)
{
printf ("%d: %e %e\n", i, REAL(data,i),
IMAG(data,i));
}
gsl_fft_complex_wavetable_free (wavetable);
gsl_fft_complex_workspace_free (workspace);
return 0;
}
Note that we have assumed that the program is using the default
gsl error handler (which calls abort for any errors). If
you are not using a safe error handler you would need to check the
return status of all the gsl routines.
The functions for real data are similar to those for complex data. However, there is an important difference between forward and inverse transforms. The fourier transform of a real sequence is not real. It is a complex sequence with a special symmetry:
z_k = z_{N-k}^*
A sequence with this symmetry is called conjugate-complex or
half-complex. This different structure requires different
storage layouts for the forward transform (from real to half-complex)
and inverse transform (from half-complex back to real). As a
consequence the routines are divided into two sets: functions in
gsl_fft_real which operate on real sequences and functions in
gsl_fft_halfcomplex which operate on half-complex sequences.
Functions in gsl_fft_real compute the frequency coefficients of a
real sequence. The half-complex coefficients c of a real sequence
x are given by fourier analysis,
c_k = \sum_{j=0}^{N-1} x_j \exp(-2 \pi i j k / N)
Functions in gsl_fft_halfcomplex compute inverse or backwards
transforms. They reconstruct real sequences by fourier synthesis from
their half-complex frequency coefficients, c,
x_j = (1/N) \sum_{k=0}^{N-1} c_k \exp(2 \pi i j k / N)
The symmetry of the half-complex sequence implies that only half of the complex numbers in the output need to be stored. The remaining half can be reconstructed using the half-complex symmetry condition. (This works for all lengths, even and odd. When the length is even the middle value, where k=N/2, is also real). Thus only N real numbers are required to store the half-complex sequence, and the transform of a real sequence can be stored in the same size array as the original data.
The precise storage arrangements depend on the algorithm, and are different for radix-2 and mixed-radix routines. The radix-2 function operates in-place, which constrains the locations where each element can be stored. The restriction forces real and imaginary parts to be stored far apart. The mixed-radix algorithm does not have this restriction, and it stores the real and imaginary parts of a given term in neighboring locations. This is desirable for better locality of memory accesses.
This section describes radix-2 FFT algorithms for real data. They use the Cooley-Tukey algorithm to compute in-place FFTs for lengths which are a power of 2.
The radix-2 FFT functions for real data are declared in the header file `gsl_fft_real.h'.
This function computes an in-place radix-2 FFT of length n and stride stride on the real array data. The output is a half-complex sequence, which is stored in-place. The arrangement of the half-complex terms uses the following scheme: for k < N/2 the real part of the k-th term is stored in location k, and the corresponding imaginary part is stored in location N-k. Terms with k > N/2 can be reconstructed using the symmetry z_k = z^*_{N-k}. The terms for k=0 and k=N/2 are both purely real, and count as a special case. Their real parts are stored in locations 0 and N/2 respectively, while their imaginary parts which are zero are not stored.
The following table shows the correspondence between the output data and the equivalent results obtained by considering the input data as a complex sequence with zero imaginary part,
complex[0].real    =  data[0]      complex[0].imag    =  0
complex[1].real    =  data[1]      complex[1].imag    =  data[N-1]
...............                    ................
complex[k].real    =  data[k]      complex[k].imag    =  data[N-k]
...............                    ................
complex[N/2].real  =  data[N/2]    complex[N/2].imag  =  0
...............                    ................
complex[k'].real   =  data[k]      complex[k'].imag   = -data[N-k]      (k' = N - k)
...............                    ................
complex[N-1].real  =  data[1]      complex[N-1].imag  = -data[N-1]
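The correspondence in the table can be written as a short loop. This is an illustrative sketch only: the function name radix2_halfcomplex_unpack and the separate re[] and im[] output arrays are assumptions made for the example, not a description of the library's own routines.

```c
#include <assert.h>
#include <stddef.h>

/* Expand a half-complex sequence stored in the radix-2 layout
   (real part of term k in data[k], imaginary part in data[n-k])
   into full arrays re[0..n-1] and im[0..n-1].  n must be even. */
void
radix2_halfcomplex_unpack (const double *data, size_t n,
                           double *re, double *im)
{
  size_t k;

  assert (data != NULL && re != NULL && im != NULL);
  assert (n >= 2 && n % 2 == 0);

  /* k = 0 is purely real */
  re[0] = data[0];
  im[0] = 0.0;

  /* 0 < k < n/2: stored term plus its mirror image at n-k */
  for (k = 1; k < n / 2; k++)
    {
      re[k] = data[k];
      im[k] = data[n - k];
      re[n - k] = data[k];
      im[n - k] = -data[n - k];
    }

  /* k = n/2 is purely real as well */
  re[n / 2] = data[n / 2];
  im[n / 2] = 0.0;
}
```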
The radix-2 FFT functions for halfcomplex data are declared in the header file `gsl_fft_halfcomplex.h'.
These functions compute the inverse or backwards in-place radix-2 FFT of
length n and stride stride on the half-complex sequence
data stored according to the output scheme used by
gsl_fft_real_radix2. The result is a real array stored in natural
order.
This section describes mixed-radix FFT algorithms for real data. The mixed-radix functions work for FFTs of any length. They are a reimplementation of the real-FFT routines in the Fortran FFTPACK library by Paul Swarztrauber. The theory behind the algorithm is explained in the article Fast Mixed-Radix Real Fourier Transforms by Clive Temperton. The routines here use the same indexing scheme and basic algorithms as FFTPACK.
The functions use the FFTPACK storage convention for half-complex sequences. In this convention the half-complex transform of a real sequence is stored with frequencies in increasing order, starting at zero, with the real and imaginary parts of each frequency in neighboring locations. When a value is known to be real the imaginary part is not stored. The imaginary part of the zero-frequency component is never stored: it is known to be zero, since the zero-frequency component is simply the sum of the (real) input data. For a sequence of even length the imaginary part of the frequency n/2 is not stored either, since the symmetry z_k = z_{N-k}^* implies that this is purely real too.
The storage scheme is best shown by some examples. The table below
shows the output for an odd-length sequence, n=5. The two columns
give the correspondence between the 5 values in the half-complex
sequence returned by gsl_fft_real_transform, halfcomplex[] and the
values complex[] that would be returned if the same real input
sequence were passed to gsl_fft_complex_backward as a complex
sequence (with imaginary parts set to 0),
complex[0].real  =  halfcomplex[0]    complex[0].imag  =  0
complex[1].real  =  halfcomplex[1]    complex[1].imag  =  halfcomplex[2]
complex[2].real  =  halfcomplex[3]    complex[2].imag  =  halfcomplex[4]
complex[3].real  =  halfcomplex[3]    complex[3].imag  = -halfcomplex[4]
complex[4].real  =  halfcomplex[1]    complex[4].imag  = -halfcomplex[2]
The upper elements of the complex array, complex[3] and
complex[4] are filled in using the symmetry condition. The
imaginary part of the zero-frequency term complex[0].imag is
known to be zero by the symmetry.
The next table shows the output for an even-length sequence, n=6. In the even case there are two values which are purely real,
complex[0].real  =  halfcomplex[0]    complex[0].imag  =  0
complex[1].real  =  halfcomplex[1]    complex[1].imag  =  halfcomplex[2]
complex[2].real  =  halfcomplex[3]    complex[2].imag  =  halfcomplex[4]
complex[3].real  =  halfcomplex[5]    complex[3].imag  =  0
complex[4].real  =  halfcomplex[3]    complex[4].imag  = -halfcomplex[4]
complex[5].real  =  halfcomplex[1]    complex[5].imag  = -halfcomplex[2]
The upper elements of the complex array, complex[4] and
complex[5] are filled in using the symmetry condition. Both
complex[0].imag and complex[3].imag are known to be zero.
All these functions are declared in the header files `gsl_fft_real.h' and `gsl_fft_halfcomplex.h'.
sin and cos, for accuracy.
Recursion relations could be used to compute the lookup table faster,
but if an application performs many FFTs of the same length then
computing the wavetable is a one-off overhead which does not affect the
final throughput.
The wavetable structure can be used repeatedly for any transform of the same length. The table is not modified by calls to any of the other FFT functions. The appropriate type of wavetable must be used for forward real or inverse half-complex transforms.
For gsl_fft_real_transform, data is an array of
time-ordered real data. For gsl_fft_halfcomplex_transform,
data contains Fourier coefficients in the half-complex ordering
described above. There is no restriction on the length n.
Efficient modules are provided for subtransforms of length 2, 3, 4 and
5. Any remaining factors are computed with a slow, O(n^2),
general-n module. The caller must supply a wavetable containing
trigonometric lookup tables and a workspace work.
This function converts a single real array, real_coefficient into
an equivalent complex array, complex_coefficient, (with imaginary
part set to zero), suitable for gsl_fft_complex routines. The
algorithm for the conversion is simply,
for (i = 0; i < n; i++)
  {
    complex_coefficient[i].real
      = real_coefficient[i];
    complex_coefficient[i].imag
      = 0.0;
  }
This function converts halfcomplex_coefficient, an array of
half-complex coefficients as returned by gsl_fft_real_transform, into an
ordinary complex array, complex_coefficient. It fills in the
complex array using the symmetry
z_k = z_{N-k}^*
to reconstruct the redundant elements. The algorithm for the conversion
is,
complex_coefficient[0].real
  = halfcomplex_coefficient[0];
complex_coefficient[0].imag
  = 0.0;

for (i = 1; i < n - i; i++)
  {
    double hc_real
      = halfcomplex_coefficient[2 * i - 1];
    double hc_imag
      = halfcomplex_coefficient[2 * i];
    complex_coefficient[i].real = hc_real;
    complex_coefficient[i].imag = hc_imag;
    complex_coefficient[n - i].real = hc_real;
    complex_coefficient[n - i].imag = -hc_imag;
  }

if (i == n - i)
  {
    complex_coefficient[i].real
      = halfcomplex_coefficient[n - 1];
    complex_coefficient[i].imag
      = 0.0;
  }
Here is an example program using gsl_fft_real_transform and
gsl_fft_halfcomplex_inverse. It generates a real signal in the
shape of a square pulse. The pulse is Fourier transformed to frequency
space, and all but the lowest ten frequency components are removed from
the array of Fourier coefficients returned by
gsl_fft_real_transform.
The remaining Fourier coefficients are transformed back to the time domain, to give a filtered version of the square pulse. Since the Fourier coefficients are stored using the half-complex symmetry, both positive and negative frequencies are removed and the final filtered signal is also real.
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_errno.h>
#include <gsl/gsl_fft_real.h>
#include <gsl/gsl_fft_halfcomplex.h>

int
main (void)
{
  int i, n = 100;
  double data[n];

  gsl_fft_real_wavetable * real;
  gsl_fft_halfcomplex_wavetable * hc;
  gsl_fft_real_workspace * work;

  /* Build a square pulse occupying the middle third of the array. */
  for (i = 0; i < n; i++)
    {
      data[i] = 0.0;
    }

  for (i = n / 3; i < 2 * n / 3; i++)
    {
      data[i] = 1.0;
    }

  /* Print the original signal. */
  for (i = 0; i < n; i++)
    {
      printf ("%d: %e\n", i, data[i]);
    }
  printf ("\n");

  /* Forward transform: data now holds half-complex coefficients. */
  work = gsl_fft_real_workspace_alloc (n);
  real = gsl_fft_real_wavetable_alloc (n);

  gsl_fft_real_transform (data, 1, n,
                          real, work);

  gsl_fft_real_wavetable_free (real);

  /* Zero the half-complex entries from index 11 upwards. */
  for (i = 11; i < n; i++)
    {
      data[i] = 0;
    }

  /* Inverse transform back to the time domain. */
  hc = gsl_fft_halfcomplex_wavetable_alloc (n);

  gsl_fft_halfcomplex_inverse (data, 1, n,
                               hc, work);
  gsl_fft_halfcomplex_wavetable_free (hc);

  /* Print the filtered signal. */
  for (i = 0; i < n; i++)
    {
      printf ("%d: %e\n", i, data[i]);
    }

  gsl_fft_real_workspace_free (work);
  return 0;
}
Low-pass filtered version of a real pulse, output from the example program.
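In the FFTPACK storage convention described earlier, frequency 0 occupies index 0 and frequency k occupies indices 2k-1 (real part) and 2k (imaginary part). A low-pass cut expressed in terms of a highest retained frequency can therefore be sketched as follows; the helper name halfcomplex_lowpass and its kmax parameter are hypothetical and do not belong to the library.

```c
#include <assert.h>
#include <stddef.h>

/* Zero every coefficient whose frequency index exceeds kmax in a
   half-complex array of length n stored in the FFTPACK layout.
   Returns the number of array entries that were cleared. */
size_t
halfcomplex_lowpass (double *hc, size_t n, size_t kmax)
{
  /* frequency kmax ends at index 2*kmax, so kmax+1 starts one later */
  size_t first = 2 * kmax + 1;
  size_t i, cleared = 0;

  assert (hc != NULL);

  for (i = first; i < n; i++)
    {
      hc[i] = 0.0;
      cleared++;
    }
  return cleared;
}
```

With n = 100 and kmax = 10 this clears indices 21 to 99. By the same arithmetic, a loop that starts clearing at index 11, as in the program above, retains frequencies 0 through 5.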
A good starting point for learning more about the FFT is the review article Fast Fourier Transforms: A Tutorial Review and A State of the Art by Duhamel and Vetterli,
To find out about the algorithms used in the GSL routines you may want to consult the LaTeX document GSL FFT Algorithms (it is included in GSL, as `doc/fftalgorithms.tex'). This has general information on FFTs and explicit derivations of the implementation for each routine. There are also references to the relevant literature. For convenience some of the more important references are reproduced below.
There are several introductory books on the FFT with example programs, such as The Fast Fourier Transform by Brigham and DFT/FFT and Convolution Algorithms by Burrus and Parks,
Both these introductory books cover the radix-2 FFT in some detail. The mixed-radix algorithm at the heart of the FFTPACK routines is reviewed in Clive Temperton's paper,
The derivation of FFTs for real-valued data is explained in the following two articles,
In 1979 the IEEE published a compendium of carefully-reviewed Fortran FFT programs in Programs for Digital Signal Processing. It is a useful reference for implementations of many different FFT algorithms,
For serious FFT work we recommend the use of the dedicated FFTW library by Frigo and Johnson. The FFTW library is self-optimizing -- it automatically tunes itself for each hardware platform in order to achieve maximum performance. It is available under the GNU GPL.
This chapter describes routines for performing numerical integration (quadrature) of a function in one dimension. There are routines for adaptive and non-adaptive integration of general functions, with specialised routines for specific cases. These include integration over infinite and semi-infinite ranges, singular integrals, including logarithmic singularities, computation of Cauchy principal values and oscillatory integrals. The library reimplements the algorithms used in QUADPACK, a numerical integration package written by Piessens, Doncker-Kapenga, Uberhuber and Kahaner. Fortran code for QUADPACK is available on Netlib.
The functions described in this chapter are declared in the header file `gsl_integration.h'.
Each algorithm computes an approximation to a definite integral of the form,
I = \int_a^b f(x) w(x) dx
where w(x) is a weight function (for general integrands w(x)=1). The user provides absolute and relative error bounds (epsabs, epsrel) which specify the following accuracy requirement,
|RESULT - I| <= \max(epsabs, epsrel |I|)
where RESULT is the numerical approximation obtained by the algorithm. The algorithms attempt to estimate the absolute error ABSERR = |RESULT - I| in such a way that the following inequality holds,
|RESULT - I| <= ABSERR <= \max(epsabs, epsrel |I|)
The routines will fail to converge if the error bounds are too stringent, but always return the best approximation obtained up to that stage.
The algorithms in QUADPACK use a naming convention based on the following letters,
Q - quadrature routine
N - non-adaptive integrator
A - adaptive integrator
G - general integrand (user-defined)
W - weight function with integrand
S - singularities can be more readily integrated
P - points of special difficulty can be supplied
I - infinite range of integration
O - oscillatory weight function, cos or sin
F - Fourier integral
C - Cauchy principal value
The algorithms are built on pairs of quadrature rules, a higher order rule and a lower order rule. The higher order rule is used to compute the best approximation to an integral over a small range. The difference between the results of the higher order rule and the lower order rule gives an estimate of the error in the approximation.
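The idea of a rule pair can be illustrated with a much simpler pair than the ones QUADPACK uses. The sketch below substitutes the trapezoid rule (lower order) and Simpson's rule (higher order) for the Gauss-Kronrod and Clenshaw-Curtis pairs of the library; the names rule_pair, rule_pair_result and sample_integrand are hypothetical. Both rules share the same three function evaluations, and their difference serves as the error estimate.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Result of applying a pair of rules to one interval. */
struct rule_pair_result
{
  double value;   /* higher-order (Simpson) approximation */
  double error;   /* |higher - lower|, used as the error estimate */
};

/* Example integrand, f(x) = x^2. */
double
sample_integrand (double x)
{
  return x * x;
}

/* Apply the trapezoid rule and Simpson's rule on [a, b], reusing the
   endpoint evaluations for both. */
struct rule_pair_result
rule_pair (double (*f) (double), double a, double b)
{
  struct rule_pair_result r;
  double h, fa, fm, fb, low, high;

  assert (f != NULL);

  h = b - a;
  fa = f (a);
  fm = f (0.5 * (a + b));
  fb = f (b);

  low = 0.5 * h * (fa + fb);               /* trapezoid */
  high = h * (fa + 4.0 * fm + fb) / 6.0;   /* Simpson */

  r.value = high;
  r.error = fabs (high - low);
  return r;
}
```

The estimate is deliberately pessimistic: it measures the error of the lower-order rule, while the value returned comes from the higher-order one.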
The algorithms for general functions (without a weight function) are based on Gauss-Kronrod rules. A Gauss-Kronrod rule begins with a classical Gaussian quadrature rule of order m. This is extended with additional points between each of the abscissae to give a higher order Kronrod rule of order 2m+1. The Kronrod rule is efficient because it reuses existing function evaluations from the Gaussian rule. The higher order Kronrod rule is used as the best approximation to the integral, and the difference between the two rules is used as an estimate of the error in the approximation.
For integrands with weight functions the algorithms use Clenshaw-Curtis quadrature rules. A Clenshaw-Curtis rule begins with an n-th order Chebyshev polynomial approximation to the integrand. This polynomial can be integrated exactly to give an approximation to the integral of the original function. The Chebyshev expansion can be extended to higher orders to improve the approximation. The presence of singularities (or other behavior) in the integrand can cause slow convergence in the Chebyshev approximation. The modified Clenshaw-Curtis rules used in QUADPACK separate out several common weight functions which cause slow convergence. These weight functions are integrated analytically against the Chebyshev polynomials to precompute modified Chebyshev moments. Combining the moments with the Chebyshev approximation to the function gives the desired integral. The use of analytic integration for the singular part of the function allows exact cancellations and substantially improves the overall convergence behavior of the integration.
The QNG algorithm is a non-adaptive procedure which uses fixed Gauss-Kronrod abscissae to sample the integrand at a maximum of 87 points. It is provided for fast integration of smooth functions.
This function applies the Gauss-Kronrod 10-point, 21-point, 43-point and 87-point integration rules in succession until an estimate of the integral of f over (a,b) is achieved within the desired absolute and relative error limits, epsabs and epsrel. The function returns the final approximation, result, an estimate of the absolute error, abserr and the number of function evaluations used, neval. The Gauss-Kronrod rules are designed in such a way that each rule uses all the results of its predecessors, in order to minimize the total number of function evaluations.
The QAG algorithm is a simple adaptive integration procedure. The
integration region is divided into subintervals, and on each iteration
the subinterval with the largest estimated error is bisected. This
reduces the overall error rapidly, as the subintervals become
concentrated around local difficulties in the integrand. These
subintervals are managed by a gsl_integration_workspace struct,
which handles the memory for the subinterval ranges, results and error
estimates.
This function applies an integration rule adaptively until an estimate of the integral of f over (a,b) is achieved within the desired absolute and relative error limits, epsabs and epsrel. The function returns the final approximation, result, and an estimate of the absolute error, abserr. The integration rule is determined by the value of key, which should be chosen from the following symbolic names,
GSL_INTEG_GAUSS15  (key = 1)
GSL_INTEG_GAUSS21  (key = 2)
GSL_INTEG_GAUSS31  (key = 3)
GSL_INTEG_GAUSS41  (key = 4)
GSL_INTEG_GAUSS51  (key = 5)
GSL_INTEG_GAUSS61  (key = 6)
corresponding to the 15, 21, 31, 41, 51 and 61 point Gauss-Kronrod rules. The higher-order rules give better accuracy for smooth functions, while lower-order rules save time when the function contains local difficulties, such as discontinuities.
On each iteration the adaptive integration strategy bisects the interval with the largest error estimate. The subintervals and their results are stored in the memory provided by workspace. The maximum number of subintervals is given by limit, which may not exceed the allocated size of the workspace.
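The bisection strategy can be sketched in a few lines of plain C. This is a simplified illustration rather than the library's implementation: it uses a Simpson and trapezoid pair in place of Gauss-Kronrod rules, a fixed-size array in place of gsl_integration_workspace, and a linear search for the worst subinterval. All names (adaptive_integrate, subinterval, MAX_INTERVALS, sample_integrand) are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define MAX_INTERVALS 256

struct subinterval { double a, b, value, error; };

/* Example integrand, f(x) = x^2. */
double
sample_integrand (double x)
{
  return x * x;
}

/* Simpson value on [a, b] with |Simpson - trapezoid| as error estimate. */
static struct subinterval
evaluate (double (*f) (double), double a, double b)
{
  struct subinterval s;
  double fa = f (a), fm = f (0.5 * (a + b)), fb = f (b);
  double low = 0.5 * (b - a) * (fa + fb);
  double high = (b - a) * (fa + 4.0 * fm + fb) / 6.0;

  s.a = a;
  s.b = b;
  s.value = high;
  s.error = fabs (high - low);
  return s;
}

/* Bisect the subinterval with the largest error estimate until the
   summed estimate is below max(epsabs, epsrel*|result|).  Returns 0 on
   success and 1 if the limit on subintervals is reached first; the best
   approximation so far is stored in *result in either case. */
int
adaptive_integrate (double (*f) (double), double a, double b,
                    double epsabs, double epsrel, size_t limit,
                    double *result, double *abserr, size_t *used)
{
  struct subinterval work[MAX_INTERVALS];
  size_t n = 1, i;

  assert (f != NULL && result != NULL && abserr != NULL && used != NULL);
  assert (limit >= 1 && limit <= MAX_INTERVALS);

  work[0] = evaluate (f, a, b);

  for (;;)
    {
      double sum = 0.0, err = 0.0, tol, mid, upper;
      size_t worst = 0;

      for (i = 0; i < n; i++)
        {
          sum += work[i].value;
          err += work[i].error;
          if (work[i].error > work[worst].error)
            worst = i;
        }

      *result = sum;
      *abserr = err;
      *used = n;

      tol = epsrel * fabs (sum);
      if (epsabs > tol)
        tol = epsabs;

      if (err <= tol)
        return 0;
      if (n >= limit)
        return 1;

      /* replace the worst subinterval by its two halves */
      mid = 0.5 * (work[worst].a + work[worst].b);
      upper = work[worst].b;
      work[n] = evaluate (f, mid, upper);
      work[worst] = evaluate (f, work[worst].a, mid);
      n++;
    }
}
```

A production routine would keep the subintervals ordered by error instead of rescanning the whole list on every iteration.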
The presence of an integrable singularity in the integration region causes an adaptive routine to concentrate new subintervals around the singularity. As the subintervals decrease in size the successive approximations to the integral converge in a limiting fashion. This approach to the limit can be accelerated using an extrapolation procedure. The QAGS algorithm combines adaptive bisection with the Wynn epsilon-algorithm to speed up the integration of many types of integrable singularities.
This function applies the Gauss-Kronrod 21-point integration rule adaptively until an estimate of the integral of f over (a,b) is achieved within the desired absolute and relative error limits, epsabs and epsrel. The results are extrapolated using the epsilon-algorithm, which accelerates the convergence of the integral in the presence of discontinuities and integrable singularities. The function returns the final approximation from the extrapolation, result, and an estimate of the absolute error, abserr. The subintervals and their results are stored in the memory provided by workspace. The maximum number of subintervals is given by limit, which may not exceed the allocated size of the workspace.
This function applies the adaptive integration algorithm QAGS taking account of the user-supplied locations of singular points. The array pts of length npts should contain the endpoints of the integration ranges defined by the integration region and locations of the singularities. For example, to integrate over the region (a,b) with break-points at x_1, x_2, x_3 (where a < x_1 < x_2 < x_3 < b) the following pts array should be used
pts[0] = a
pts[1] = x_1
pts[2] = x_2
pts[3] = x_3
pts[4] = b
with npts = 5.
If you know the locations of the singular points in the integration
region then this routine will be faster than QAGS.
This function computes the integral of the function f over the infinite interval (-\infty,+\infty). The integral is mapped onto the interval (0,1] using the transformation x = (1-t)/t,
\int_{-\infty}^{+\infty} dx f(x) = \int_0^1 dt (f((1-t)/t) + f(-(1-t)/t)) / t^2
It is then integrated using the QAGS algorithm. The normal 21-point Gauss-Kronrod rule of QAGS is replaced by a 15-point rule, because the transformation can generate an integrable singularity at the origin. In this case a lower-order rule is more efficient.
This function computes the integral of the function f over the semi-infinite interval (a,+\infty). The integral is mapped onto the interval (0,1] using the transformation x = a + (1-t)/t,
\int_a^{+\infty} dx f(x) = \int_0^1 dt f(a + (1-t)/t) / t^2
and then integrated using the QAGS algorithm.
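This function computes the Cauchy principal value of the integral of f over (a,b), with a singularity at c,
I = \int_a^b dx f(x) / (x - c) = \lim_{\epsilon \to 0} { \int_a^{c-\epsilon} dx f(x) / (x - c) + \int_{c+\epsilon}^b dx f(x) / (x - c) }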
The adaptive bisection algorithm of QAG is used, with modifications to ensure that subdivisions do not occur at the singular point x = c. When a subinterval contains the point x = c or is close to it then a special 25-point modified Clenshaw-Curtis rule is used to control the singularity. Further away from the singularity the algorithm uses an ordinary 15-point Gauss-Kronrod integration rule.
The QAWS algorithm is designed for integrands with algebraic-logarithmic singularities at the end-points of an integration region. In order to work efficiently the algorithm requires a precomputed table of Chebyshev moments.
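This function allocates space for a gsl_integration_qaws_table
struct and associated workspace describing a singular weight function
W(x) with the parameters (\alpha, \beta, \mu, \nu),
W(x) = (x-a)^\alpha (b-x)^\beta \log^\mu (x-a) \log^\nu (b-x)
where \alpha > -1, \beta > -1, and \mu = 0, 1, \nu = 0, 1. The weight function can take four different forms depending on the values of \mu and \nu,
W(x) = (x-a)^\alpha (b-x)^\beta                           (\mu = 0, \nu = 0)
W(x) = (x-a)^\alpha (b-x)^\beta \log(x-a)                 (\mu = 1, \nu = 0)
W(x) = (x-a)^\alpha (b-x)^\beta \log(b-x)                 (\mu = 0, \nu = 1)
W(x) = (x-a)^\alpha (b-x)^\beta \log(x-a) \log(b-x)       (\mu = 1, \nu = 1)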
The singular points (a,b) do not have to be specified until the integral is computed, where they are the endpoints of the integration range.
The function returns a pointer to the newly allocated
gsl_integration_qaws_table if no errors were detected, and 0 in
the case of error.
gsl_integration_qaws_table struct t.
This function computes the integral of the function f(x) over the interval (a,b) with the singular weight function (x-a)^\alpha (b-x)^\beta \log^\mu (x-a) \log^\nu (b-x). The parameters of the weight function (\alpha, \beta, \mu, \nu) are taken from the table t. The integral is,
I = \int_a^b dx f(x) (x-a)^\alpha (b-x)^\beta \log^\mu (x-a) \log^\nu (b-x)
The adaptive bisection algorithm of QAG is used. When a subinterval contains one of the endpoints then a special 25-point modified Clenshaw-Curtis rule is used to control the singularities. For subintervals which do not include the endpoints an ordinary 15-point Gauss-Kronrod integration rule is used.
The QAWO algorithm is designed for integrands with an oscillatory factor, \sin(\omega x) or \cos(\omega x). In order to work efficiently the algorithm requires a table of Chebyshev moments which must be pre-computed with calls to the functions below.
This function allocates space for a gsl_integration_qawo_table
struct and its associated workspace describing a sine or cosine weight
function W(x) with the parameters (\omega, L),
W(x) = \sin(\omega x)
W(x) = \cos(\omega x)
The parameter L must be the length of the interval over which the function will be integrated L = b - a. The choice of sine or cosine is made with the parameter sine which should be chosen from one of the two following symbolic values:
GSL_INTEG_COSINE
GSL_INTEG_SINE
The gsl_integration_qawo_table is a table of the trigonometric
coefficients required in the integration process. The parameter n
determines the number of levels of coefficients that are computed. Each
level corresponds to one bisection of the interval L, so that
n levels are sufficient for subintervals down to the length
L/2^n. The integration routine gsl_integration_qawo
returns the error GSL_ETABLE if the number of levels is
insufficient for the requested accuracy.
This function uses an adaptive algorithm to compute the integral of f over (a,b) with the weight function \sin(\omega x) or \cos(\omega x) defined by the table wf,
I = \int_a^b dx f(x) \sin(\omega x)
I = \int_a^b dx f(x) \cos(\omega x)
The results are extrapolated using the epsilon-algorithm to accelerate the convergence of the integral. The function returns the final approximation from the extrapolation, result, and an estimate of the absolute error, abserr. The subintervals and their results are stored in the memory provided by workspace. The maximum number of subintervals is given by limit, which may not exceed the allocated size of the workspace.
Those subintervals with "large" widths d, d\omega > 4 are computed using a 25-point Clenshaw-Curtis integration rule, which handles the oscillatory behavior. Subintervals with a "small" width d\omega < 4 are computed using a 15-point Gauss-Kronrod integration.
This function attempts to compute a Fourier integral of the function f over the semi-infinite interval [a,+\infty),
I = \int_a^{+\infty} dx f(x) \sin(\omega x)
I = \int_a^{+\infty} dx f(x) \cos(\omega x)
The parameter \omega is taken from the table wf (the length L can take any value, since it is overridden by this function to a value appropriate for the Fourier integration). The integral is computed using the QAWO algorithm over each of the subintervals,
C_1 = [a, a + c]
C_2 = [a + c, a + 2c]
... = ...
C_k = [a + (k-1)c, a + kc]
where c = (2 floor(|\omega|) + 1) \pi/|\omega|. The width c is chosen to cover an odd number of half-periods so that the contributions from the intervals alternate in sign and decrease monotonically in magnitude when f is positive and monotonically decreasing. The sum of this sequence of contributions is accelerated using the epsilon-algorithm.
This function works to an overall absolute tolerance of epsabs. The following strategy is used: on each interval C_k the algorithm tries to achieve the tolerance
TOL_k = u_k epsabs
where u_k = (1 - p)p^{k-1} and p = 9/10. The sum of the geometric series of contributions from each interval gives an overall tolerance of epsabs.
If the integration of a subinterval leads to difficulties then the accuracy requirement for subsequent intervals is relaxed,
TOL_k = u_k \max(epsabs, \max_{i<k} E_i)
where E_k is the estimated error on the interval C_k.
The subintervals and their results are stored in the memory provided by workspace. The maximum number of subintervals is given by limit, which may not exceed the allocated size of the workspace. The integration over each subinterval uses the memory provided by cycle_workspace as workspace for the QAWO algorithm.
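In addition to the standard error codes for invalid arguments the functions can return the following values,
GSL_EMAXITER - the maximum number of subdivisions was exceeded.
GSL_EROUND - the tolerance cannot be reached because of roundoff error, or roundoff error was detected in the extrapolation table.
GSL_ESING - a non-integrable singularity or other bad integrand behavior was found in the integration interval.
GSL_EDIVERGE - the integral is divergent, or too slowly convergent to be integrated numerically.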
The integrator QAGS will handle a large class of definite
integrals. For example, consider the following integral, which has an
algebraic-logarithmic singularity at the origin,
\int_0^1 x^{-1/2} \log(x) dx = -4
The program below computes this integral to a relative accuracy bound of
1e-7.
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_integration.h>

/* Integrand log(alpha*x)/sqrt(x); alpha is passed through params. */
double f (double x, void * params) {
  double alpha = *(double *) params;
  double f = log(alpha*x) / sqrt(x);
  return f;
}

int
main (void)
{
  gsl_integration_workspace * w
    = gsl_integration_workspace_alloc(1000);

  double result, error;
  double expected = -4.0;
  double alpha = 1.0;

  gsl_function F;
  F.function = &f;
  F.params = &alpha;

  /* epsabs = 0, epsrel = 1e-7, at most 1000 subintervals */
  gsl_integration_qags (&F, 0, 1, 0, 1e-7, 1000,
                        w, &result, &error);

  printf("result          = % .18f\n", result);
  printf("exact result    = % .18f\n", expected);
  printf("estimated error = % .18f\n", error);
  printf("actual error    = % .18f\n", result - expected);
  printf("intervals = %lu\n", (unsigned long) w->size);

  gsl_integration_workspace_free (w);
  return 0;
}
The results below show that the desired accuracy is achieved after 8 subdivisions.
bash$ ./a.out
result          = -3.999999999999973799
exact result    = -4.000000000000000000
estimated error =  0.000000000000246025
actual error    =  0.000000000000026201
intervals = 8
In fact, the extrapolation procedure used by QAGS produces an
accuracy of almost twice as many digits. The error estimate returned by
the extrapolation procedure is larger than the actual error, giving a
margin of safety of one order of magnitude.
The following book is the definitive reference for QUADPACK, and was written by the original authors. It provides descriptions of the algorithms, program listings, test programs and examples. It also includes useful advice on numerical integration and many references to the numerical integration literature used in developing QUADPACK.
The library provides a large collection of random number generators which can be accessed through a uniform interface. Environment variables allow you to select different generators and seeds at runtime, so that you can easily switch between generators without needing to recompile your program. Each instance of a generator keeps track of its own state, allowing the generators to be used in multi-threaded programs. Additional functions are available for transforming uniform random numbers into samples from continuous or discrete probability distributions such as the Gaussian, log-normal or Poisson distributions.
These functions are declared in the header file `gsl_rng.h'.
In 1988, Park and Miller wrote a paper entitled "Random number generators: good ones are hard to find." [Commun. ACM, 31, 1192--1201]. Fortunately, some excellent random number generators are available, though poor ones are still in common use. You may be happy with the system-supplied random number generator on your computer, but you should be aware that as computers get faster, requirements on random number generators increase. Nowadays, a simulation that calls a random number generator millions of times can often finish before you can make it down the hall to the coffee machine and back.
A very nice review of random number generators was written by Pierre L'Ecuyer, as Chapter 4 of the book: Handbook on Simulation, Jerry Banks, ed. (Wiley, 1997). The chapter is available in postscript from L'Ecuyer's ftp site (see references). Knuth's volume on Seminumerical Algorithms (originally published in 1968) devotes 170 pages to random number generators, and has recently been updated in its 3rd edition (1997). It is brilliant, a classic. If you don't own it, you should stop reading right now, run to the nearest bookstore, and buy it.
A good random number generator will satisfy both theoretical and statistical properties. Theoretical properties are often hard to obtain (they require real math!), but one prefers a random number generator with a long period, low serial correlation, and a tendency not to "fall mainly on the planes." Statistical tests are performed with numerical simulations. Generally, a random number generator is used to estimate some quantity for which the theory of probability provides an exact answer. Comparison to this exact answer provides a measure of "randomness".
It is important to remember that a random number generator is not a "real" function like sine or cosine. Unlike real functions, successive calls to a random number generator yield different return values. Of course that is just what you want for a random number generator, but to achieve this effect, the generator must keep track of some kind of "state" variable. Sometimes this state is just an integer (sometimes just the value of the previously generated random number), but often it is more complicated than that and may involve a whole array of numbers, possibly with some indices thrown in. To use the random number generators, you do not need to know the details of what comprises the state, and besides that varies from algorithm to algorithm.
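The role of the state can be made concrete with a very small generator. The sketch below implements the "minimal standard" multiplicative generator of Park and Miller, x_{n+1} = 16807 x_n mod (2^31 - 1), with its state held in an explicit struct so that several independent instances can coexist. The type and function names are hypothetical and the code is independent of the library's own generators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The entire state of this generator is the previous output. */
struct minstd_state
{
  uint32_t x;
};

/* Initialize the state.  A seed that reduces to zero would make the
   generator emit zeros forever, so it is replaced by 1. */
void
minstd_seed (struct minstd_state *s, uint32_t seed)
{
  assert (s != NULL);
  seed %= 2147483647u;
  s->x = (seed == 0) ? 1u : seed;
}

/* Advance the state and return the new value, which always lies in
   the range 1 .. 2^31 - 2. */
uint32_t
minstd_next (struct minstd_state *s)
{
  assert (s != NULL);
  s->x = (uint32_t) (((uint64_t) s->x * 16807u) % 2147483647u);
  return s->x;
}
```

Two instances seeded with the same value produce identical sequences without interfering with each other. This generator never returns zero, so its smallest possible output is 1.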
The random number generator library uses two special structs,
gsl_rng_type which holds static information about each type of
generator and gsl_rng which describes an instance of a generator
created from a given gsl_rng_type.
The functions described in this section are declared in the header file `gsl_rng.h'.
gsl_rng * r = gsl_rng_alloc (gsl_rng_taus);
If there is insufficient memory to create the generator then the
function returns a null pointer and the error handler is invoked with an
error code of GSL_ENOMEM.
The generator is automatically initialized with the default seed,
gsl_rng_default_seed. This is zero by default but can be changed
either directly or by using the environment variable GSL_RNG_SEED
(see section Random number environment variables).
The details of the available generator types are described later in this chapter.
ranlux generator used a seed
of 314159265, and so choosing s equal to zero reproduces this when
using gsl_rng_ranlux.
The following functions return uniformly distributed random numbers, either as integers or double precision floating point numbers. To obtain non-uniform distributions see section Random Number Distributions.
gsl_rng_max (r) and gsl_rng_min (r).
gsl_rng_get(r) by gsl_rng_max(r) + 1.0 in double
precision. Some generators compute this ratio internally so that they
can provide floating point numbers with more than 32 bits of randomness
(the maximum number of bits that can be portably represented in a single
unsigned long int).
gsl_rng_uniform until a non-zero value is obtained. You can use
this function if you need to avoid a singularity at 0.0.
If n is larger than the range of the generator then the function
calls the error handler with an error code of GSL_EINVAL and
returns zero.
The following functions provide information about an existing generator. You should use them in preference to hard-coding the generator parameters into your own code.
printf("r is a '%s' generator\n",
gsl_rng_name (r));
would print something like r is a 'taus' generator.
gsl_rng_max returns the largest value that gsl_rng_get
can return.
gsl_rng_min returns the smallest value that gsl_rng_get
can return. Usually this value is zero. There are some generators with
algorithms that cannot return zero, and for these generators the minimum
value is 1.
void * state = gsl_rng_state (r);
size_t n = gsl_rng_size (r);
fwrite (state, n, 1, stream);
const gsl_rng_type **t, **t0;

t0 = gsl_rng_types_setup ();

printf("Available generators:\n");

for (t = t0; *t != 0; t++)
  {
    printf("%s\n", (*t)->name);
  }
The library allows you to choose a default generator and seed from the
environment variables GSL_RNG_TYPE and GSL_RNG_SEED and
the function gsl_rng_env_setup. This makes it easy to try out
different generators and seeds without having to recompile your program.
GSL_RNG_TYPE and
GSL_RNG_SEED and uses their values to set the corresponding
library variables gsl_rng_default and
gsl_rng_default_seed. These global variables are defined as
follows,
extern const gsl_rng_type *gsl_rng_default;
extern unsigned long int gsl_rng_default_seed;
The environment variable GSL_RNG_TYPE should be the name of a
generator, such as taus or mt19937. The environment
variable GSL_RNG_SEED should contain the desired seed value. It
is converted to an unsigned long int using the C library function
strtoul.
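Reading a seed from the environment with strtoul can be sketched in standard C as follows. This is an illustration of the conversion only, not the library's code: the function name seed_from_environment, the fallback argument and the choice of base 0 (which accepts decimal, octal and hexadecimal input) are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Return the value of the environment variable `name` converted to an
   unsigned long, or `fallback` if the variable is unset, empty, out of
   range, or not entirely numeric. */
unsigned long
seed_from_environment (const char *name, unsigned long fallback)
{
  const char *text;
  char *end;
  unsigned long value;

  assert (name != NULL);

  text = getenv (name);
  if (text == NULL || *text == '\0')
    return fallback;

  errno = 0;
  value = strtoul (text, &end, 0);

  /* reject overflow, empty conversions and trailing characters */
  if (errno != 0 || end == text || *end != '\0')
    return fallback;

  return value;
}
```

Calling seed_from_environment ("GSL_RNG_SEED", 0) would mirror the documented default of a zero seed when the variable is not set.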
If you don't specify a generator for GSL_RNG_TYPE then
gsl_rng_mt19937 is used as the default. The initial value of
gsl_rng_default_seed is zero.
Here is a short program which shows how to create a global
generator using the environment variables GSL_RNG_TYPE and
GSL_RNG_SEED,
#include <stdio.h>
#include <gsl/gsl_rng.h>
gsl_rng * r; /* global generator */
int
main (void)
{
const gsl_rng_type * T;
gsl_rng_env_setup();
T = gsl_rng_default;
r = gsl_rng_alloc (T);
printf("generator type: %s\n", gsl_rng_name (r));
printf("seed = %lu\n", gsl_rng_default_seed);
printf("first value = %lu\n", gsl_rng_get (r));
return 0;
}
Running the program without any environment variables uses the initial
defaults, an mt19937 generator with a seed of 0,
bash$ ./a.out
generator type: mt19937
seed = 0
first value = 2867219139
By setting the two variables on the command line we can change the default generator and the seed,
bash$ GSL_RNG_TYPE="taus" GSL_RNG_SEED=123 ./a.out
GSL_RNG_TYPE=taus
GSL_RNG_SEED=123
generator type: taus
seed = 123
first value = 2720986350
The above methods ignore the random number `state' which changes from call to call. It is often useful to be able to save and restore the state. To permit these practices, a few somewhat more advanced functions are supplied. These include:
The function gsl_rng_print_state prints a hex-dump of the state of the generator r to
stdout. At the moment its only use is for debugging.
The functions described above make no reference to the actual algorithm used. This is deliberate, so that you can switch algorithms without having to change any of your application source code. The library provides a large number of generators of different types, including simulation-quality generators, generators provided for compatibility with other libraries, and historical generators.
The following generators are recommended for use in simulation. They have extremely long periods, low correlation and pass most statistical tests.
gsl_rng_set reproduces this.
For more information see,
The generator gsl_rng_mt19937 uses the second revision of the
seeding procedure published by the two authors above in 2002. The
original seeding procedures could cause spurious artifacts for some seed
values. They are still available through the alternate generators
gsl_rng_mt19937_1999 and gsl_rng_mt19937_1998.
The generator ranlxs0 is a second-generation version of the
RANLUX algorithm of Lüscher, which produces "luxury random
numbers". This generator provides single precision output (24 bits) at
three luxury levels ranlxs0, ranlxs1 and ranlxs2.
It uses double-precision floating point arithmetic internally and can be
significantly faster than the integer version of ranlux,
particularly on 64-bit architectures. The period of the generator is
about 10^171. The algorithm has mathematically proven properties and
can provide truly decorrelated numbers at a known level of randomness.
The higher luxury levels provide additional decorrelation between samples
as an additional safety margin.
These generators produce double precision output (48 bits) from the
RANLXS generator. The library provides two luxury levels
ranlxd1 and ranlxd2.
The ranlux generator is an implementation of the original
algorithm developed by Lüscher. It uses a
lagged-Fibonacci-with-skipping algorithm to produce "luxury random
numbers". It is a 24-bit generator, originally designed for
single-precision IEEE floating point numbers. This implementation is
based on integer arithmetic, while the second-generation versions
RANLXS and RANLXD described above provide floating-point
implementations which will be faster on many platforms.
The period of the generator is about 10^171. The algorithm has mathematically proven properties and
it can provide truly decorrelated numbers at a known level of
randomness. The default level of decorrelation recommended by Lüscher
is provided by gsl_rng_ranlux, while gsl_rng_ranlux389
gives the highest level of randomness, with all 24 bits decorrelated.
Both types of generator use 24 words of state per generator.
For more information see,
where the two underlying generators x_n and y_n are,
with coefficients a_1 = 0, a_2 = 63308, a_3 = -183326, b_1 = 86098, b_2 = 0, b_3 = -539608, and moduli m_1 = 2^31 - 1 = 2147483647 and m_2 = 2145483479.
The period of this generator is 2^205 (about 10^61). It uses 6 words of state per generator. For more information see,
with a_1 = 107374182, a_2 = a_3 = a_4 = 0, a_5 = 104480 and m = 2^31 - 1.
The period of this generator is about 10^46. It uses 5 words of state per generator. More information can be found in the following paper,
where,
computed modulo 2^32. In the formulas above ^^ denotes "exclusive-or".
Note that the algorithm relies on the properties
of 32-bit unsigned integers and has been implemented using a bitmask
of 0xFFFFFFFF to make it work on 64-bit machines.
The period of this generator is 2^88 (about 10^26). It uses 3 words of state per generator. For more information see,
The generator gsl_rng_taus2 uses the same algorithm as
gsl_rng_taus but with an improved seeding procedure described in
the paper,
The generator gsl_rng_taus2 should now be used in preference to
gsl_rng_taus.
The gfsr4 generator is like a lagged-Fibonacci generator, and
produces each number as an xor'd sum of four previous values.
Ziff (ref below) notes that "it is now widely known" that two-tap registers (such as R250, which is described below) have serious flaws, the most obvious one being the three-point correlation that comes from the definition of the generator. Nice mathematical properties can be derived for GFSR's, and numerics bears out the claim that 4-tap GFSR's with appropriately chosen offsets are as random as can be measured, using the author's test.
This implementation uses the values suggested in the example on p392 of Ziff's article: A=471, B=1586, C=6988, D=9689.
If the offsets are appropriately chosen (such as the ones in this implementation), then the sequence is said to be maximal. I'm not sure what that means, but I would guess that it means all states are part of the same cycle, which would mean that the period for this generator is astronomical; it is (2^K)^D, approximately 10^93334, where K=32 is the number of bits in the word and D is the longest lag. This would also mean that any one random number could easily be zero; i.e. 0 <= r < 2^32.
Ziff doesn't say so, but it seems to me that the bits are completely independent here, so one could use this as an efficient bit generator; each number supplying 32 random bits. The quality of the generated bits depends on the underlying seeding procedure, which may need to be improved in some circumstances.
For more information see,
The standard Unix random number generators rand, random
and rand48 are provided as part of GSL. Although these
generators are widely available individually, often they aren't all
available on the same platform. This makes it difficult to write
portable code using them and so we have included the complete set of
Unix generators in GSL for convenience. Note that these generators
don't produce high-quality randomness and aren't suitable for work
requiring accurate statistics. However, if you won't be measuring
statistical quantities and just want to introduce some variation into
your program then these generators are quite acceptable.
This is the BSD rand() generator. Its sequence is
x_{n+1} = (a x_n + c) mod m
with a = 1103515245, c = 12345 and m = 2^31. The seed specifies the initial value, x_1. The period of this generator is 2^31, and it uses 1 word of storage per generator.
These generators implement the random() family of functions, a
set of linear feedback shift register generators originally used in BSD
Unix. There are several versions of random() in use today: the
original BSD version (e.g. on SunOS4), a libc5 version (found on
older GNU/Linux systems) and a glibc2 version. Each version uses a
different seeding procedure, and thus produces different sequences.
The original BSD routines accepted a variable length buffer for the
generator state, with longer buffers providing higher-quality
randomness. The random() function implemented algorithms for
buffer lengths of 8, 32, 64, 128 and 256 bytes, and the algorithm with
the largest length that would fit into the user-supplied buffer was
used. To support these algorithms additional generators are available
with the following names,
gsl_rng_random8_bsd
gsl_rng_random32_bsd
gsl_rng_random64_bsd
gsl_rng_random128_bsd
gsl_rng_random256_bsd
where the numeric suffix indicates the buffer length. The original BSD
random function used a 128-byte default buffer and so
gsl_rng_random_bsd has been made equivalent to
gsl_rng_random128_bsd. Corresponding versions of the libc5
and glibc2 generators are also available, with the names
gsl_rng_random8_libc5, gsl_rng_random8_glibc2, etc.
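The buffer-length rule described above, where the algorithm with the largest length that fits into the supplied buffer is used, can be sketched as a small lookup. The function name is hypothetical and the code only illustrates the selection rule; it is not taken from the BSD sources or from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Return the largest supported buffer length (8, 32, 64, 128 or 256
   bytes) that fits into `buflen`, or 0 if even the smallest does not. */
size_t
random_sketch_pick (size_t buflen)
{
  static const size_t lengths[] = { 256, 128, 64, 32, 8 };
  const size_t count = sizeof lengths / sizeof lengths[0];
  size_t i;

  assert (count == 5);   /* the five lengths listed in the text */

  for (i = 0; i < count; i++)
    {
      if (lengths[i] <= buflen)
        return lengths[i];
    }
  return 0;
}
```

For example, a 100-byte buffer selects the 64-byte algorithm, which would correspond to the generators with the suffix random64.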
This is the Unix rand48 generator. Its sequence is
x_{n+1} = (a x_n + c) mod m
defined on 48-bit unsigned integers with
a = 25214903917, c = 11 and m = 2^48.
The seed specifies the upper 32 bits of the initial value, x_1,
with the lower 16 bits set to 0x330E. The function
gsl_rng_get returns the upper 32 bits from each term of the
sequence. This does not have a direct parallel in the original
rand48 functions, but forcing the result to type long int
reproduces the output of mrand48. The function
gsl_rng_uniform uses the full 48 bits of internal state to return
the double precision number x_n/m, which is equivalent to the
function drand48. Note that some versions of the GNU C Library
contained a bug in the mrand48 function which caused it to produce
different results (only the lower 16 bits of the return value were set).
The following generators are provided for compatibility with
Numerical Recipes. Note that the original Numerical Recipes
functions used single precision while we use double precision. This will
lead to minor discrepancies, but only at the level of single-precision
rounding error. If necessary you can force the returned values to single
precision by storing them in a volatile float, which prevents the
value being held in a register with double or extended precision. Apart
from this difference the underlying algorithms for the integer part of
the generators are the same.
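The volatile float technique mentioned above can be written as a one-line helper. This is a minimal sketch with a hypothetical function name; it shows only the rounding step and applies to any double, not just to values returned by a generator.

```c
#include <assert.h>

/* Round a double to single precision by storing it in a volatile
   float, which forces the value out of any wider register. */
float
to_single (double x)
{
  volatile float f = (float) x;
  assert (f == f || x != x);   /* f can be NaN only if x was NaN */
  return f;
}
```

For a value such as 0.1, which is not exactly representable in binary, the result differs from the original double once it is converted back.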
ran0 implements Park and Miller's MINSTD
algo