So far we have used the implementation of the Euler gamma function, , provided with the Solaris version of Fortran (actually, its logarithm). This makes the code non-portable. For example, we couldn't find an equivalent of Euler gamma function in IRIX-6.3 Fortran. Even if we could, the name of the subroutine could well be different.

But the world doesn't end on Solaris. Today, as well as in future, you may have to work with a variety of operating systems and hardware architectures: VMS is still around, so is MVS, and the latter is likely to go on for many years to come.

NT for all practical purposes is restricted to just one hardware architecture and aimed specially at small offices and enterprises. This makes it an unattractive system to work with, if you are a scientist or an engineer, because neither Intel hardware nor Microsoft software are particularly sympathetic towards the needs of scientists and engineers, who represent a niche market. But for one reason or another you may come across those systems in your professional career too. In may institutions and companies clerks and bureaucrats decide about institutional purchases, and, like everybody else, they are likely to be swayed by salesman pitch, propaganda, and whatever box they happen to have sitting on their own desk. Once upon a time Intel boxes used to be markedly cheaper than UNIX workstations. Today this is no longer the case, but many people haven't woken up to this fact yet.

Macs are coming back, and many scientists and engineers like them, because they are easy to use, even though they are just as clumsy and inflexible when it comes to doing science as NT.

Finally, we can't exclude that new architectures and operating systems may appear in future. None of the stuff that we see on the market today is ideal, and there is no reason why people shouldn't work on better hardware architectures and better operating systems.

Anyhow, if you want your Fortran program to compile and work seamlessly across all platforms, you should provide your own definitions for all functions and subroutines called by the program that are not a part of ANSI Fortran.

But there is also another reason to do so. The Euler gamma function is a very complicated function with poles at 0 and all negative integers, defined in the whole complex plane (with the exception of the poles, that is). There are many expressions that approximate the gamma function in various regimes quite well, or sometimes extraordinarily well. A general purpose gamma function provided, for example, with NAG or IMSL libraries, may have numerous branches and invoke various evaluation techniques depending on the argument. If you have to call such a gamma function very frequently, the cost of doing so may be quite high. So, if you know that your arguments are going to be restricted to a certain narrow range, you may be better off writing your own lightweight implementation.

In the case of our program, we really want an inverse of the
gamma function and not the gamma function itself. The inverse is
a lot easier to calculate, because it is a very regular function
without any poles and with only a very moderate ``pulsing'' growth
for *x* < 0.

There are many ways to calculate
.
For example we
could use Euler's infinite product

(2.35) |

where

(2.36) |

A Taylor expansion of this formula around *z* = 0
can be found in ``Tables of Higher Mathematical
Functions'', by H. T. Davis (later corrected by H. E. Salzer),
published here in Bloomington, Indiana,
in 1933 and in 1935. The expansion looks as follows:

where the first 26 coefficients

c_{1} |
= | 1 | . | |

c_{2} |
= | 0 | . | |

c_{3} |
= | -0 | . | |

c_{4} |
= | -0 | . | |

c_{5} |
= | 0 | . | |

c_{6} |
= | -0 | . | |

c_{7} |
= | -0 | . | |

c_{8} |
= | 0 | . | |

c_{9} |
= | -0 | . | |

c_{10} |
= | -0 | . | |

c_{11} |
= | 0 | . | |

c_{12} |
= | -0 | . | |

c_{13} |
= | -0 | . | |

c_{14} |
= | 0 | . | |

c_{15} |
= | -0 | . | |

c_{16} |
= | 0 | . | |

c_{17} |
= | 0 | . | |

c_{18} |
= | -0 | . | |

c_{19} |
= | 0 | . | |

c_{20} |
= | 0 | . | |

c_{21} |
= | -0 | . | |

c_{22} |
= | 0 | . | |

c_{23} |
= | -0 | . | |

c_{24} |
= | -0 | . | |

c_{25} |
= | 0 | . | |

c_{26} |
= | 0 | . |

(2.38) |

to move beyond, say,

The Bloomington expansion may well have been someone's master's dissertation. Recall that in those days all computations would have to be done by hand, or, at best, by using a mechanical calculator that could do additions and subtractions quite well, but was rather cumbersome when it came to multiplications and divisions. For example, to multiply a number by 3, you would have to turn its knob three times, and to divide by 3, you would have to turn the knob three times in the other direction.

Figure 2.17 shows an implementation of this function.

We will not use the Bloomington expansion in our computations, but the
program shown in Figure 2.17 illustrates the
use of array constructors
in Fortran. You can either have
a verbatim constructor, which, for an integer array, `k`,
of dimension 5, say, could look as follows:

k = (/ 1, 2, 3, 4, 5 /)This statement assigns

The same effect
can be accomplished
by using the so called ``implied `DO`''
array constructor:

k = (/ (i, i=1, 5) /)

Lisp programmers should recognise Fortran array constructors as related to array constructors in Common Lisp , although markedly less verbose. In Common Lisp we could say, for example

>(make-array 5 :initial-contents '(1 2 3 4 5)) #(1 2 3 4 5) >(make-array 5 :initial-contents (do ((mylist nil) (i 1 (1+ i))) ((> i 5) mylist) (setq mylist (append mylist (list i))))) #(1 2 3 4 5) >You should appreciate how much faster and cleaner

`k = (/ (i, i=1, 5) /)`

is compared to the Lisp `(do ...)`

form.
On the other hand the Lisp way has a great flexibility about it: you can
insert various Lisp forms including `(do ...)`

basically anywhere
within a function call.
Press, Flannery, Teukolsky, and Vetterling
recommend a formula by
Lanczos , which is a lot cheaper
to compute than the Bloomington
expansion and which is a lot more accurate too. The formula looks
as follows:

= | |||

(2.39) |

The Lanczos formula is valid for

For
,
*n* = 6 (in the Lanczos formula), and for
a specific choice of *c*_{k} (see the code in Figure 2.18),
we get that
*everywhere*
in the half-complex plane
Re(*z*) > 0.

Figure 2.18 shows the implementation of this formula as a
Fortran function `my_gamma`

.

Although it is common to calculate
on entry to
a subroutine by calling
(which in Fortran
would be `2*acos(0.0)`

), this is costly, especially
if the function has to be invoked many times. It is easy enough
to evaluate
with desired accuracy using,
for example, Maxima:

(C2) ev(sqrt(2 * %pi), numer); (D2) 2.5066282746310002 (C3)You can also invoke Octave , which we haven't talked about yet, but which, together with Matlab, we'll begin using quite soon. Octave is a program somewhat similar to Maxima, but specialising in numerical mathematics rather than in symbolic mathematics. In Octave you would calculate as follows:

octave:4> printf("%20.16f\n", sqrt(2*pi)); 2.5066282746310002 octave:5>

We can test our implementation of against the Solaris implementation by running the following simple Fortran program:

program try use euler integer, parameter :: n = 25 real (kind=long), parameter :: x_zero = 1.0_long real (kind=long) :: x integer :: i do i = 1, n x = x_zero + 0.4_long * i write(*, '(1f6.1, 2e20.10)') x, gamma(x), my_gamma(x) end do end program try

Figure 2.19 shows the result of the run.

As you see the resulting agreement is very good down to 10 significant digits.

Function `my_gamma`

illustrated in Figure 2.18 is
not optimal, because it involves two exponentiations and
three multiplications. If we were
to evaluate
instead of itself, we could save some time by replacing logarithms of multiplications
with additions of logarithms and exponentiations with multiplications.

Figure 2.20 shows the implementation of a function that calculates .

The following simple test program can be used to time calls to
`my_gamma`

versus calls to `my_lgamma`

:

program try use euler integer, parameter :: n = 1000000 real (kind=long), parameter :: x_zero = 1.0_long real (kind=long) :: x, throw_away integer :: i do i = 1, n x = x_zero + 0.00001_long * i ! throw_away = exp(my_lgamma(x)) throw_away = my_gamma(x) end do end program tryThis program calls either

`my_gamma`

or `my_lgamma`

million
times. The
version takes 8.7s CPU time on an
Ultra-1 workstation, whereas the
version takes
10.5s. Alas! If you use this same program to test
Solaris' `d_lgamma`

it takes only 1.4s, sic!
This level of optimisation can be achieved only by coding the function directly in the assembler or by utilising its hardware implementation. On many UNIX systems, especially the ones aimed directly at scientific and engineering markets, this may frequently be the case.

Similarly, many window operations are assembler-level or even hardware-level optimised under NT and MacOS.