In this section I will lay the ground for an MPI version of a diffusion
code based on Jacobi iterations. The idea is that a flat rectangular region,
whose dimensions in this token example are going to be
,
will be divided amongst 12 processes, each handling a small square
.
But because it is a differential problem, apart from
the
data squares,
the processes will also have to maintain additional data that is going
to mirror data that corresponds to whatever the neighbouring processes
have in their
squares.
In this example we assume that
we are interested only in one layer of data from the neighbours. Because
every square, with the exception of the ones at the boundary of the
region, is going to have four neighbours, we will have to add either a row
or a column to our
squares in order to accomodate data transferred
from neighbouring processes. Consequently each process will have to
look after a
integer matrix, of which an internal
matrix represents that process' own data, and the boundaries of the matrix
represent data obtained from the neighbours.
All that we are going to do within this program is to
At every stage the master process is going to collect data from all other processes and print it on standard output so as to show the state of the system at one glance.
In order to help you understand the whole procedure, I'm going to show you some of the output of the program first, then I'll list the program for you, and then discuss it in more detail.
When the program starts, right after the processes have formed the new Cartesian topology, oriented themselves within it, but before they exchanged any data, their state looks as follows:
9 9 9 9 10 10 10 10 11 11 11 11 9 9 9 9 10 10 10 10 11 11 11 11 9 9 9 9 10 10 10 10 11 11 11 11 9 9 9 9 10 10 10 10 11 11 11 11 6 6 6 6 7 7 7 7 8 8 8 8 6 6 6 6 7 7 7 7 8 8 8 8 6 6 6 6 7 7 7 7 8 8 8 8 6 6 6 6 7 7 7 7 8 8 8 8 3 3 3 3 4 4 4 4 5 5 5 5 3 3 3 3 4 4 4 4 5 5 5 5 3 3 3 3 4 4 4 4 5 5 5 5 3 3 3 3 4 4 4 4 5 5 5 5 0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2
Every process initializes its own
matrix to
its own rank number. The matrices are displayed by the
master process in a way that illustrates the topology
of the whole system, i.e., we have a rectangular
topology
,
process rank 0 is in the lower
left corner, and its neighbours are process rank 1
on the right and process rank 3 above. Process rank
4 sits roughly in the middle and its neighbours are
process rank 1 below, process rank 7 above, process
rank 3 on the left, and process rank 5 on the right.
Now, the processes exchange the content of their inner rows with their neighbours, i.e., process rank 4 sends the content of its row 3 (counted from the bottom) to process number 7, which puts it in its own row 1 (counted from the bottom), and, at the same time receives the content of row 2 from process rank 7, and places it in its own row 4.
So that after this exchange operation has taken place, the matrices look as follows:
9 9 9 9 10 10 10 10 11 11 11 11 9 9 9 9 10 10 10 10 11 11 11 11 9 9 9 9 10 10 10 10 11 11 11 11 6 6 6 6 7 7 7 7 8 8 8 8 9 9 9 9 10 10 10 10 11 11 11 11 6 6 6 6 7 7 7 7 8 8 8 8 6 6 6 6 7 7 7 7 8 8 8 8 3 3 3 3 4 4 4 4 5 5 5 5 6 6 6 6 7 7 7 7 8 8 8 8 3 3 3 3 4 4 4 4 5 5 5 5 3 3 3 3 4 4 4 4 5 5 5 5 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2
Now we have to repeat this operation, but this time exchanging columns between neighbours. And so process rank 4 will send its column 2 (counted from the left) to process rank 3, which is going to place it in its own column 4 (counted from the left), and, at the same time process rank 3 is going to send its own column 3 to process rank 4, which is going to place it in its own column 1.
And after all this is over, the matrices are going to look as follows:
9 9 9 10 9 10 10 11 10 11 11 11 9 9 9 10 9 10 10 11 10 11 11 11 9 9 9 10 9 10 10 11 10 11 11 11 6 6 6 7 6 7 7 8 7 8 8 8 9 9 9 10 9 10 10 11 10 11 11 11 6 6 6 7 6 7 7 8 7 8 8 8 6 6 6 7 6 7 7 8 7 8 8 8 3 3 3 4 3 4 4 5 4 5 5 5 6 6 6 7 6 7 7 8 7 8 8 8 3 3 3 4 3 4 4 5 4 5 5 5 3 3 3 4 3 4 4 5 4 5 5 5 0 0 0 1 0 1 1 2 1 2 2 2 3 3 3 4 3 4 4 5 4 5 5 5 0 0 0 1 0 1 1 2 1 2 2 2 0 0 0 1 0 1 1 2 1 2 2 2 0 0 0 1 0 1 1 2 1 2 2 2
Observe that now not only does every process know what data is harboured by its neighbours on the left, on the right, above, and below, but they even know the data that somehow made it diagonally, i.e., from the upper left, upper right, lower left, and lower right directions - but the latter is not going to last, because that data came from the mirror regions, so it is not truly representative of the processes' internal state.
Now every process can perform its own Jacobi iteration within its own little patch for a diffusion problem, and set new values to its own data, while keeping the mirrored data unchanged.
Before the next iteration can commence, the data has to be exchanged between the neighbours again, in order to refresh the mirrors.
So now let us have a look at the code: