Tutorial - Hello world

Tutorial number one

Introduction

In this tutorial we will look at writing, compiling and running our first Mesham parallel code. You will see an introduction to how a program is structured and how the standard functions are used, and we will discuss different forms of parallel structure. This tutorial assumes that you have the Mesham compiler and runtime library installed and working on your machine as per the installation instructions.

Hello world

#include <io>
#include <parallel>
#include <string>

function void main() {
   var p;
   par p from 0 to 3 {
      print("Hello world from pid="+itostring(pid())+" with p="+itostring(p)+"\n");
   };
};

Compilation and execution

Copy and paste this code into a text file and name it test.mesh (it can of course be called anything, but we will assume this name in the tutorial). Compile it by issuing the command mcc test.mesh, which will report any errors (there should be none with this example) and produce an executable, in this case named test. To run the code, issue the command mpiexec -np 1 ./test, which invokes the MPI process manager with one process and runs the executable. Mesham is designed such that, if it is run with only one process, it will spawn any other processes it needs. However, the code can only be run with either one process or exactly the number of processes it requires; any other number is assumed to be a mistake and will result in an error message.

When you run the code you should see the following output, although the order of the lines may differ:

Hello world from pid=0 with p=0
Hello world from pid=2 with p=2
Hello world from pid=1 with p=1
Hello world from pid=3 with p=3

A look under the bonnet

Let's take a closer look at the code and see exactly what it is doing. Lines 1 to 3 include standard function headers; the program calls functions from all three of these sub libraries (print from io, pid from parallel and itostring from string). Wrapping the name in < > tells the preprocessor to look for system includes first (as these are).

Line 5 declares the main function, which is the program entry point; every code that you wish to execute requires this function. Only a limited number of items, such as type and program variable declarations, may appear outside of a function body. At line 6 we declare the variable p, but at this point we provide no further information (such as the type) because this can be deduced from the next line. At line 7 we use the par keyword to declare a parallel loop (the parallel equivalent of a for loop), which says: execute this loop for p from 0 to 3 (four iterations) in parallel, running each iteration within its own process.

Line 8 is executed by four independent processes, each calling the print function to display a message to standard output. The return value of the pid function, which gives us the current process's absolute id, and the variable p are both Int (the latter is deduced because p is used in the par statement). It is only possible to print Strings, so the itostring function is called to convert each integer to a string.

At this point it is worth noting two aspects of this code. The first (and very important) one is that all blocks are delimited by sequential composition (;). This is because, in a parallel language, it is important to make explicit whether blocks are executed one after another (sequentially) or at the same time (in parallel). Secondly, see how we have displayed both the process id (via the pid function call) and the value of variable p. Whilst they are probably equal in this simple example, there is no guarantee that they will be: the language allocates the iterations of a par loop to whichever processes it sees fit.

Making things more interesting

We are now going to make things a little more interesting and build upon what we have just seen. You have just read that the par loop assigns iterations to whichever processes it considers most appropriate; we are now going to look at this in more detail.

#include <io>
#include <parallel>
#include <string>

function void main() {
   var p;
   skip ||
   par p from 0 to 3 {
      print("Hello world from pid="+itostring(pid())+" with p="+itostring(p)+"\n");
   };
};

Now compile and execute this code in the same manner as described above. You should see output similar to the following (but possibly in a different order):

Hello world from pid=1 with p=0
Hello world from pid=2 with p=1
Hello world from pid=4 with p=3
Hello world from pid=3 with p=2

So what's going on here? The output is telling us that the first iteration of the par loop is running on process 1, the second on process 2, and so on. The reason for this is the use of parallel composition (||) on line 7. At this line we are in effect saying "do nothing, using the skip command, and at the same time run the par loop". Since the skip is placed on process 0, the iterations of the loop end up on processes 1 to 4. In fact a par loop is a syntactic shortcut for lots of parallel compositions (in this case we could replace the par loop with four parallel compositions, although the code would look really messy!)

Absolute process selection

We have already said that the par loop does not make any guarantee as to which iteration is placed upon which process. However, sometimes it is useful to know exactly what is running where. To this end we have two constructs: the proc and group statements.

Single process selection

To select a single process absolutely by its ID number you can use the proc statement. The following code illustrates this:

#include <io>

function void main() {
   proc 0 {
      print("Hello from process 0\n");
   };

   proc 1 {
      print("Hello from process 1\n");
   };
};

If you compile and execute this, it will display two lines of text, one saying hello from process 0 and the other saying hello from process 1, although which comes first depends on the speed of the processes and will often vary between runs!
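
To confirm that each proc block really lands on the process it names, the block can print its own id. The following is a minimal sketch that only combines constructs already used in this tutorial (proc, pid, itostring and print); the message text is illustrative, and the // comment syntax is an assumption rather than something shown above.

```mesham
#include <io>
#include <parallel>
#include <string>

function void main() {
   // Assumption: // starts a comment, as in C-style languages.
   // This block is selected for process 0, so pid() should return 0 here.
   proc 0 {
      print("proc 0 block is running on pid="+itostring(pid())+"\n");
   };

   // This block is selected for process 1, so pid() should return 1 here.
   proc 1 {
      print("proc 1 block is running on pid="+itostring(pid())+"\n");
   };
};
```

If proc selects processes absolutely as described, the first message should report pid=0 and the second pid=1, with the two lines appearing in either order.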

Group process selection

Whilst the proc statement sounds jolly useful (and it is!), you can imagine that if you want multiple processes, selected by their absolute process IDs, to do the same thing, then many duplicate proc statements in your code will be quite horrid (and wear out your keyboard!) Instead we supply the group statement, which allows the programmer to select multiple processes to execute the same block. Based upon the previous example code:

#include <io>
#include <parallel>
#include <string>

function void main() {
   skip ||
   group 0,1,2,3 {
      print("Hello world from pid="+itostring(pid())+"\n");
   };
};

If you compile and execute this you will get something like:

Hello world from pid=0
Hello world from pid=1
Hello world from pid=2
Hello world from pid=3

See the difference from above? Even though we have the parallel composition here, the group statement selects processes by their absolute process ID, so you can be sure that processes 0, 1, 2 and 3 are executing that block. In fact, process 0 will first run the skip statement and then the group block in this example. One last thing: notice how we had to remove all references to variable p here? Because we are no longer using the par loop, we cannot leave the declaration of this variable in the code, as the language would have no way to deduce the type of p and would produce an error during compilation (try it!)

But isn't it a bit annoying having to type each individual process id into a group statement? That is why we support the texas range (...) in a group, which means the entire range from one number to another.

#include <io>
#include <parallel>
#include <string>

function void main() {
   skip ||
   group 0,...,3 {
      print("Hello world from pid="+itostring(pid())+"\n");
   };
};

The above code is pretty much the same as the one before (and should produce the same output), but see how we have saved ourselves some typing by using the texas range in the group process list. This is especially useful when we are specifying very large ranges of processes, but it has a number of limits. Firstly, the texas range must sit between two process ids (it cannot appear first or last in the list). Secondly, the range must go upwards, so the id on the left cannot be larger than or equal to the id on the right.
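
As a sketch of these limits, the following selects processes 1 to 3 with a range that does not start at zero. It reuses only constructs from the examples above; the // comments (an assumed comment syntax, not shown elsewhere in this tutorial) list the forms that the limits just described rule out.

```mesham
#include <io>
#include <parallel>
#include <string>

function void main() {
   // Assumption: // starts a comment, as in C-style languages.
   // Valid: the range sits between two ids and goes upwards (1 < 3).
   // Ruled out by the limits described in the text:
   //   group ...,3    (the range appears first in the list)
   //   group 0,...    (the range appears last in the list)
   //   group 3,...,1  (the left id is larger than the right id)
   //   group 2,...,2  (the left id is equal to the right id)
   group 1,...,3 {
      print("Hello world from pid="+itostring(pid())+"\n");
   };
};
```

Following the description of the group statement, this should print one line each for processes 1, 2 and 3, in no particular order.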

Summary

Whilst the code we have been looking at here is very simple, in this tutorial we have looked at the four basic parallel constructs which we can use to structure our code and discussed the differences between these. We have also looked at writing a simple Mesham code using the main function and using standard functions via including the appropriate sub libraries.