The Syllabus
The general syllabus for this course is a continuation of what has
already been covered in P573 and what is required by the PhD
Qualifying Exam.
Here is the syllabus itself:
 1. Basic computer concepts and tools
    (a) Measuring time on computers
        (cf. "How to Time, Save, and Resubmit LoadLeveler Jobs")
        i.   wall clock, user, system times
        ii.  clock resolution and overhead of timing
        iii. other basic performance data: page faults,
             block I/O operations, memory/space integrals
    (b) Data sets and characteristics: ASCII, binary, sequential,
        random access, record addressability (cf. P573)
    (c) Floating point data: IEEE standard, range and precision,
        infinity, +0 and -0, denormalised numbers, NaN (cf. P573)
 2. High performance computer architecture
    (a) Organisation of computer memory hierarchy
        i.   Memory banks and interleaving
        ii.  Caches and registers
             a. Direct mapped, set associative
             b. Cache lines
             c. Replacement policies
             d. Write-through and write-back
             e. Snooping
        iii. Enhancing data locality in codes
        iv.  I/O architectures
    (b) Microprocessors
        i.   CISC versus RISC versus EPIC (Merced)
        ii.  Pipelining of data and instructions
        iii. Superscalar organisation
        iv.  Address computations and pointers
    (c) Supercomputer organisation
        i.   Vectorisation (cf. P573)
        ii.  Shared memory, distributed memory
        iii. Task versus data parallelism: SIMD versus MIMD
        iv.  Topologies: mesh, hypercube, tree
        v.   Synchronisation and communication: MPI, PVM, blocking
             communication, broadcasting
        vi.  Examples: CM2, CM5, SGI PC, Sun E10000, SP2, Cray T3E,
             SGI O2000, HP Exemplar, Fujitsu VPP300, NEC SV6
 3. Numerical methods overview (here the emphasis is on implications
    for data structures and mapping of computations to machine
    architectures and less on the mathematical analysis of the methods)
    (a) Common models, prototypes and implications for computer codes:
        for each of these be able to discuss implementation issues,
        choice of data structures, performance prediction, impact
        on structure of computational kernels (cf. P573)
        i.   heat equation: parabolic PDEs
        ii.  wave equation: hyperbolic PDEs
        iii. Laplace equation: elliptic PDEs
        iv.  planetary motion: systems of ODEs
        v.   double pendulum: chaotic ODEs
        vi.  N-body systems: particle methods, reduced order methods
             (e.g., Barnes-Hut, multipole)
        vii. signal processing: Fourier analysis (FFT,
             butterfly pattern, cf. P573)
    (b) Discretisations (cf. P573)
        i.   time discretisation: explicit versus implicit methods
        ii.  spatial discretisation: finite differences,
             finite elements, finite volumes, spectral elements
        iii. uniform and quasi-uniform meshes
        iv.  irregular and adaptive meshing
        v.   integral equation methods
    (c) Sparse matrix data structures and their manipulation
        i.   operations needed on sparse matrices: matrix-vector
             products, Gaussian elimination, triangular solvers
        ii.  coordinate-wise, compressed sparse row, modified sparse
             row, jagged diagonals
        iii. load/store analysis and pipelining for sparse matrix data
             structures
 4. Performance analysis and improvement
    (a) Profiling
        i.   instrumentation and sampling-based tools:
             gprof, tprof, pixie, CASE tools
        ii.  interpreting profiling information
    (b) Benchmarking, MFLOPS, MIPS, theoretical peak performance
    (c) Analysing and improving performance
        i.   using compiler optimisations
        ii.  typical techniques: common sense, loop interchange,
             unrolling, splitting, blocking, jamming
        iii. validation of results
 5. Programming language and systems issues in scientific computing
    (a) Fortran 90 concepts: vector and array operators, modules
        and interfaces, operators and when and how to use them (cf. P573)
    (b) data parallelism in High Performance Fortran (cf. P573)
    (c) languages for interactive scientific experiments:
        Matlab, Mathematica, Maple (cf. P573)
    (d) object-oriented scientific programming techniques
    (e) understanding libraries: templates, macros, BLAS,
        LAPACK, and related resources (cf. P573)
    (f) the role of the programmer in understanding the compiler,
        preprocessors and optimisers
    (g) programming support for large scientific databases (cf. P573)
    (h) software support for parallel programs (cf. P573)
        i.   parallelising compiler techniques:
             synchronisation methods, types of data dependencies,
             compiler directives
        ii.  communicating processes: PVM, MPI
        iii. POSIX threads (Pthreads)
 6. Parallelism in scientific algorithms
    (a) Modeling of parallelism in theory and practice
        i.   speedup
        ii.  parallel efficiency
        iii. Amdahl's law
        iv.  computation/communication ratio
    (b) Parallel algorithmic techniques
        i.   speedup
        ii.  recursive doubling, parallel prefix
        iii. divide-and-conquer
        iv.  domain decomposition
        v.   data distribution (cf. P573)
    (c) parallel algorithms
        i.   parallel sorting
        ii.  basic linear algebra operations (cf. P573)
        iii. fast Fourier transforms (cf. P573)
        iv.  particle methods
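To give a flavour of item 6(a), here is a minimal sketch of speedup, parallel efficiency and Amdahl's law. It is only an illustration, not part of the course material: the function names and the choice of Python are assumptions made for this example, and the model ignores communication cost altogether.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Amdahl's bound on speedup.

    serial_fraction is the share of the run time that cannot be
    parallelised (between 0 and 1); processors is the number of
    processors used. Both names are illustrative assumptions.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def parallel_efficiency(speedup: float, processors: int) -> float:
    """Speedup divided by the number of processors used."""
    return speedup / processors


if __name__ == "__main__":
    # A code that is 10% serial, run on growing processor counts.
    for p in (1, 2, 4, 8, 16, 1024):
        s = amdahl_speedup(0.1, p)
        print(p, round(s, 3), round(parallel_efficiency(s, p), 3))
```

In this model a code that is 10% serial can never run more than ten times faster, however many processors are added, which is why the efficiency column keeps falling.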
As was the case in P573, we will attempt to cover various points of
this syllabus by working on a couple of larger projects. Whatever
science, mathematics, and computational techniques those projects
require will be explained in sufficient detail.
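As an example of the kind of data structure question raised in item 3(c), the following is a minimal sketch of a matrix-vector product for a matrix held in compressed sparse row form. The array names (values, col_index, row_ptr) follow a common convention but are assumptions of this sketch, as is the use of plain Python lists.

```python
def csr_matvec(values, col_index, row_ptr, x):
    """Return y = A x for a matrix A stored in compressed sparse row form.

    values    : non-zero entries of A, row by row
    col_index : column number of each entry in values
    row_ptr   : row_ptr[i] is where row i starts in values;
                the final entry is the total number of non-zeros
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        # Only the stored non-zeros of this row are visited.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_index[k]]
    return y


if __name__ == "__main__":
    # The 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]].
    values = [4.0, 1.0, 3.0, 2.0, 5.0]
    col_index = [0, 2, 1, 0, 2]
    row_ptr = [0, 2, 3, 5]
    print(csr_matvec(values, col_index, row_ptr, [1.0, 2.0, 3.0]))
```

Note the indirect access x[col_index[k]] in the inner loop: it is this indirection that makes the load/store analysis for sparse kernels different from the dense case.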
Zdzislaw Meglicki
2001-02-26