Project Plan
Some Beowulf clusters are designed so that not all nodes are visible to the outside world. In these cases there is often a "front end" node that connects to the outside world, while the rest of the nodes sit on an internal network. This is a disadvantage for PVM users on an outside computer who want to add the Beowulf cluster to their configuration: the PVM code on their computer cannot communicate with the Beowulf nodes on the internal network.
To test this port, I will compare the performance characteristics of two cluster configurations. The first will test how my code runs on the original setup, using only one network. The second will be set up so that my code can run on both networks, using the gateway port to pass messages back and forth.
In both configurations I will start by testing the code on a small number of nodes (2 or 4). I will then increase the number of nodes to examine scaling and find out whether this port creates a bottleneck.
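A minimal sketch of how the scaling measurements could be summarized, assuming each run yields one wall-clock time per node count. The function name scaling_table and the dictionary layout are hypothetical and are not part of PVM or the benchmarks. Speedup is taken relative to the smallest node count tested, and efficiency is the speedup divided by the growth in node count. Efficiency that falls much faster in the two-network configuration than in the one-network configuration would point to the gateway as a bottleneck.

```python
def scaling_table(timings):
    """Return (nodes, speedup, efficiency) rows for one cluster configuration.

    timings: dict mapping node count -> measured wall-clock seconds.
    Speedup and efficiency are relative to the smallest node count tested.
    """
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    rows = []
    for nodes in sorted(timings):
        speedup = base_time / timings[nodes]
        # Efficiency of 1.0 means the run scaled perfectly from the baseline.
        efficiency = speedup * base_nodes / nodes
        rows.append((nodes, speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder numbers for illustration only; these are NOT measurements.
    placeholder = {2: 100.0, 4: 50.0, 8: 40.0}
    for nodes, speedup, efficiency in scaling_table(placeholder):
        print(f"{nodes:>3} nodes  speedup {speedup:.2f}  efficiency {efficiency:.2f}")
```

Running the same function on the timings from each configuration gives two tables that can be compared row by row.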
I plan to use three benchmark programs that have already been written to test throughput: PVMPOV and two programs from Parkbench, the communication benchmark (COMMS1 or COMMS2) and the communication bottleneck benchmark (POLY3). I would like to set them up to use three different types of parallel programming:
Embarrassingly Parallel - A computation that can be divided into a number of completely independent parts, each of which can be executed by a separate processor. A truly embarrassingly parallel computation involves no communication between the separate processes, and results from the slaves do not need to be combined to obtain the final result.
Data Partitioning - Similar to embarrassingly parallel, except that the partial results have to be combined to obtain the final result. Partitioning can be applied either to the data, by operating on the divided data concurrently, or to the functions of a program. The former is called data decomposition and the latter functional decomposition. Functional decomposition is not as trivial as data decomposition, so I am going to implement data decomposition. I do not expect this choice to affect the results, but this still needs to be confirmed.
Pipelined - The problem is divided into a series of tasks that have to be completed one after the other. I have also read that this is similar to, or the basis of, sequential programming. Each task will be executed by a separate process or processor. Each stage will contribute to the overall problem and pass on the information needed by subsequent stages.
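The three styles above can be contrasted in a short sketch. This is an illustration only, assuming Python threads within a single process as stand-ins for PVM tasks on separate nodes; it does not use PVM and does not give real parallel speedup. The function work and the other names are hypothetical. The point is the structure: no combining step in the first case, a combining step in the second, and stage-to-stage message passing in the third.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def work(x):
    """Hypothetical per-item task standing in for real benchmark work."""
    return x * x


def embarrassingly_parallel(items, workers=4):
    # Each item is handled independently; the outputs are never merged.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, items))


def data_partitioned_sum(items, workers=4):
    # Data decomposition: split the data, compute partial sums concurrently,
    # then combine the partial sums into one final result.
    chunks = [items[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda chunk: sum(work(x) for x in chunk), chunks))
    return sum(partials)


_DONE = object()  # sentinel marking the end of the input stream


def _stage(func, inbox, outbox):
    # One pipeline stage: receive an item, process it, pass it on.
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def pipelined(items, stages):
    # One thread per stage, connected by FIFO queues, so order is preserved.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

In the pipelined case every item crosses every stage boundary, so this style would be expected to send the most messages per unit of work.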
Schedule: