Let us have a look at the logs in the working directory:

    gustav@bh1 $ pwd
    /N/B/gustav/tmp
    gustav@bh1 $ ls
    bank.0  bank.1  bank.2  bank.3  bank.4  bank.5  bank.6  bank.7
    gustav@bh1 $

The master's log is in bank.0. It begins with:

    sent row 0 to 1
    sent row 1 to 2
    sent row 2 to 3
    sent row 3 to 4
    sent row 4 to 5
    sent row 5 to 6
    sent row 6 to 7

This is followed by the routine work. Answers are received from the workers and new jobs are sent back to them:

    received row 1 from 2
    sent row 7 to 2
    received row 0 from 1
    sent row 8 to 1
    received row 3 from 4
    sent row 9 to 4
    ...

Eventually the whole computation is complete and workers have to be sent termination messages:

    received row 92 from 3
    terminated process 3 with tag 100
    received row 94 from 5
    terminated process 5 with tag 100
    received row 95 from 6
    terminated process 6 with tag 100
    received row 96 from 7
    terminated process 7 with tag 100
    received row 97 from 4
    terminated process 4 with tag 100
    received row 98 from 2
    terminated process 2 with tag 100
    received row 99 from 1
    terminated process 1 with tag 100

Now let us have a look at the log file of worker number 3:

    gustav@bh1 $ cat bank.3
    received broadcast from 0
    received a message from 0, tag 2
    sent row 2 to 0
    received a message from 0, tag 10
    sent row 10 to 0
    received a message from 0, tag 16
    sent row 16 to 0
    ...
    received a message from 0, tag 79
    sent row 79 to 0
    received a message from 0, tag 86
    sent row 86 to 0
    received a message from 0, tag 92
    sent row 92 to 0
    received a message from 0, tag 100
    exiting on tag 100
    gustav@bh1 $

The row this worker sends to 0 is the row of vector c, i.e., just a single integer.
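
The dispatch pattern visible in these logs can be replayed in a few lines of Python. This is only a sketch for illustration, not the program that produced the logs: it runs in a single process without any message passing, and the names `simulate_master` and `job_time` are hypothetical. Judging from the logs, it assumes that the message tag carries the row number and that a tag equal to the number of rows (100 here) means "terminate". It also assumes there are at least as many rows as workers.

```python
import heapq


def simulate_master(n_rows=100, n_workers=7, job_time=None):
    """Replay the master's dispatch logic and return its log lines.

    job_time(worker, row) is a hypothetical cost function giving the time a
    worker needs for a row. Workers are numbered 1..n_workers as in the logs.
    """
    if n_rows < n_workers:
        raise ValueError("this sketch assumes at least one row per worker")
    if job_time is None:
        job_time = lambda worker, row: 1.0  # every job costs the same

    log = []
    pending = []  # heap of (finish_time, worker, row) for jobs in flight
    next_row = 0

    # Prime every worker with one row.
    for worker in range(1, n_workers + 1):
        heapq.heappush(pending, (job_time(worker, next_row), worker, next_row))
        log.append(f"sent row {next_row} to {worker}")
        next_row += 1

    # Serve whichever worker finishes first.
    while pending:
        now, worker, row = heapq.heappop(pending)
        log.append(f"received row {row} from {worker}")
        if next_row < n_rows:
            # There is still work left: hand the next row to the same worker.
            finish = now + job_time(worker, next_row)
            heapq.heappush(pending, (finish, worker, next_row))
            log.append(f"sent row {next_row} to {worker}")
            next_row += 1
        else:
            # No rows left: the termination tag equals n_rows.
            log.append(f"terminated process {worker} with tag {n_rows}")
    return log


if __name__ == "__main__":
    for line in simulate_master()[:10]:
        print(line)
```

Passing a `job_time` that returns a smaller value for one worker makes that worker come back for new rows more often, which is how a faster machine ends up with a larger share of the work.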
The idea behind the job queue paradigm is that you keep all processes as busy as they can get, even if some end up handling more load than others, e.g., because they run on faster machines. But this works only if the workers don't have to wait for new jobs from the master process longer than it takes them to execute the jobs themselves. If they do, the cost of communication outweighs the cost of computation so much that you'll be better off executing the whole computation sequentially.
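
A back-of-the-envelope model gives a rough feel for where that break-even point lies. The sketch below is an illustration under stated assumptions, not a measurement. It assumes that every job takes the same time `t_compute`, that the master needs `t_comm` to receive one answer and send out the next job, and that the master serves one worker at a time. The function names and parameters are hypothetical.

```python
def estimated_times(n_jobs, n_workers, t_compute, t_comm):
    """Return (sequential_time, job_queue_time) under the simple model.

    The master can hand out at most one job every t_comm, and each worker
    needs t_compute + t_comm per job, so the slower of the two sets the pace.
    """
    sequential = n_jobs * t_compute
    per_job = max(t_comm, (t_compute + t_comm) / n_workers)
    return sequential, n_jobs * per_job


def job_queue_pays_off(n_jobs, n_workers, t_compute, t_comm):
    """True if the model predicts the job queue beats the sequential run."""
    sequential, parallel = estimated_times(n_jobs, n_workers, t_compute, t_comm)
    return parallel < sequential


if __name__ == "__main__":
    # Cheap communication, expensive jobs: the job queue should win.
    print(job_queue_pays_off(100, 7, t_compute=1.0, t_comm=0.01))
    # Expensive communication, cheap jobs: the sequential run should win.
    print(job_queue_pays_off(100, 7, t_compute=0.01, t_comm=1.0))
```

In this model, once `t_comm` reaches `t_compute` the master alone takes at least as long as the sequential run, no matter how many workers are added.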