Difference between pages "The Arjuna Compiler" and "Tutorial - Hello world"

<metadesc>Mesham is a type oriented programming language allowing the writing of high performance parallel codes which are efficient yet simple to write and maintain</metadesc>
== Overview ==
  
''' This page refers to the [[Arjuna]] line of compilers which is up to version 0.5 and is legacy with respect to the latest [[Oubliette]] 1.0 line'''
  
Although not essential to the programmer, it is quite useful to know the basics of how the implementation hierarchy works.
  
The core translator produces ANSI-standard C99 code which uses the Message Passing Interface (version 2) for communication. The target machine therefore needs an implementation of MPI; OpenMPI, MPICH or a vendor-specific MPI will all work with the generated code. Our runtime library (known as LOGS) must also be linked in. The runtime library performs two roles. Firstly, it is architecture specific (versions exist for Linux, Windows and other platforms): it contains any non-portable code which is needed and is optimised for specific platforms. Secondly, it contains functions which are called often and which would otherwise increase the size of the generated C code.
  
<center>[[Image:overview.jpg|Overview of Translation Process]]</center>
  
The resulting executable can be thought of as any normal executable and can be run in a number of ways. For simplicity, the user can execute it by double clicking it, and the program will automatically spawn the number of processes required. Alternatively, the executable can be run via the MPI daemon, and may be started via a process file or queue submission program. As long as your MPI implementation supports multi-core machines (and the majority do), the code can be executed properly on a multi-core machine, often with the processes wrapping around the cores (for instance, 2 processes on 2 cores is 1 process on each, and 6 processes on 2 cores is 3 processes on each).
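The wrap-around arithmetic can be sketched in C as follows. This is only an illustration of the counting, assuming process i is placed on core i modulo the number of cores; the function name `processes_on_core` is hypothetical, and real placement is decided by the MPI implementation and the operating system.

```c
#include <assert.h>

/* Hypothetical helper (not part of Mesham or MPI): how many processes land
 * on a given core if process i is placed on core (i % ncores).
 * This round-robin placement is an assumption made for illustration. */
int processes_on_core(int nprocs, int ncores, int core)
{
    assert(nprocs >= 0 && ncores > 0 && core >= 0 && core < ncores);
    int count = nprocs / ncores;   /* every core receives at least this many */
    if (core < nprocs % ncores) {  /* leftovers go to the lowest-numbered cores */
        count += 1;
    }
    return count;
}
```

With 6 processes on 2 cores, each core receives 3; with 5 processes on 2 cores, this model gives core 0 three processes and core 1 two.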
  
== Translation In More Detail ==
  
The translator itself is organised into a number of phases. Firstly, your Mesham code goes through a preprocessor, written in Java, which does a number of jobs, such as adding scoping information. When this is complete the code is sent to the translation server - because of the design of FlexibO, the language we wrote the translator in, the actual translation is performed by a server listening via TCP/IP. This server can be on the local machine or a remote one, depending on the setup of your network. Once translation has completed, the generated C code is sent back to the client via TCP/IP and from there can be compiled. The most important benefit of this approach is flexibility - most commonly we use Mesham via the command line, but a web-based interface also exists, allowing code to be written without the programmer installing any software on their machine.
  
<center>[[Image:flexdetail.jpg|Flexibo translation in detail]]</center>
  
== Command Line Options ==
  
* '''-o [name]''' ''Select output filename''
* '''-I[dir]''' ''Look in the directory (as well as the current one) for preprocessor files''
 
* '''-c''' ''Output C code only''
 
* '''-t''' ''Just link and output C code''
 
* '''-e''' ''Display C compiler errors and warnings also''
 
* '''-s''' ''Silent operation (no warnings)''
 
* '''-f [args]''' ''Forward Arguments to C compiler''
 
* '''-pp''' ''Just preprocess the Mesham source and output results''
 
* '''-static''' ''Statically link against the runtime library''
 
* '''-shared''' ''Dynamically link against the runtime library (default)''
 
* '''-debug''' ''Display compiler structural warnings before rerunning''
 
  
== Static and Dynamic Linking Against the RTL ==
  
You can link against the runtime library (RTL) either statically or dynamically. Linking statically places a copy of the RTL within your executable. The advantage is that the RTL need not be installed on the target machine, because the executable is completely self contained. Linking dynamically means that the RTL must be on the target machine (it is linked in at runtime). The advantage is that the executable is considerably smaller, and a change in the RTL need not result in all your code requiring a recompile.

Revision as of 15:19, 14 January 2013

Introduction

In this tutorial we will have a look at writing, compiling and running our first Mesham parallel code. You will see an introduction to how we structure program code and use the standard functions, and we will discuss different forms of parallel structure. This tutorial assumes that you have the Mesham compiler and runtime library installed and working on your machine.

Hello world

#include <io>
#include <parallel>
#include <string>

function void main() {
   var p;
   par p from 0 to 3 {
      print("Hello world from pid="+itostring(pid())+" with p="+itostring(p)+"\n");
   };
};

Compilation and execution

Copy and paste this code into a text file and name it test.mesh - of course it can be called anything, but we will assume this name in the tutorial. Compile it by issuing the command mcc test.mesh, which will report any errors (there should be none with this example) and produce an executable, in this case named test. To run the code, issue the command mpiexec -np 1 ./test - this invokes the MPI process manager with one process and runs the executable. Mesham is designed such that, if run with one process only, it will spawn any other processes it needs. However, the code can only be run with either one process or the correct number of processes - any other number is assumed to be a mistake and will result in an error message.
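The launch rule described above can be written as a small C predicate. This is a sketch of the rule as stated in the text, not the runtime's actual check; the name `launch_count_ok` and its parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical predicate modelling the stated rule: a launch is accepted
 * when it uses exactly one process (the remaining processes are then
 * spawned) or exactly the number of processes the code requires. */
bool launch_count_ok(int requested, int required)
{
    return requested == 1 || requested == required;
}
```

For the hello world code, which needs four processes, launching with 1 or 4 processes would be accepted under this rule, while launching with 2 or 3 would produce the error message.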

When you run the code you should see the following output, although the order of the lines may be different:

Hello world from pid=0 with p=0
Hello world from pid=2 with p=2
Hello world from pid=1 with p=1
Hello world from pid=3 with p=3

A look under the bonnet

Let's take a further look at the code and see exactly what it is doing. Lines 1 to 3 include standard function headers - we use function calls in the program from all three of these sub libraries (print from io, pid from parallel and itostring from string). Wrapping the name in < > braces tells the preprocessor to look for system includes first (which these are).

Line 5 declares the main function, which is the program entry point; all compiled codes that you wish to execute require this function. Only a limited number of items, such as type and program variable declarations, may appear outside of a function body. At line 6 we declare the variable p, but at this point we have opted to provide no further information (such as the type) because this can be deduced on the next line. At line 7 we use the par keyword to declare a parallel loop (the parallel equivalent of a for loop), which is basically saying: execute this loop for p from 0 to 3 (four iterations) in parallel, running each iteration within its own process.
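Because both bounds of the par loop are included, the number of iterations (and so of processes) is one more than the difference between them. A one-line C sketch of that arithmetic, with the hypothetical name `par_iterations`:

```c
/* Hypothetical helper: number of iterations of a loop whose bounds
 * "from" and "to" are both inclusive, as in "par p from 0 to 3". */
int par_iterations(int from, int to)
{
    return to - from + 1;
}
```

For the loop in this tutorial, `par_iterations(0, 3)` gives 4.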

Line 8 is executed by four independent processes, each calling the print function to display a message to standard out. The return value of the pid function, which provides the current process's absolute id, and the variable p are both Int (the latter is deduced because p is used in the par statement). It is only possible to print Strings, so the itostring function is called to convert each integer to a string.
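For readers more familiar with C, the integer-to-string conversion and concatenation on line 8 correspond roughly to a formatted write. The sketch below only illustrates that correspondence; it is not the code the Mesham translator generates, and `format_hello` is a hypothetical name.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical C analogue of line 8: build the message that the Mesham
 * code assembles with itostring and string concatenation.
 * Returns the value of snprintf (the length the full message needs). */
int format_hello(char *buf, size_t size, int pid, int p)
{
    return snprintf(buf, size, "Hello world from pid=%d with p=%d\n", pid, p);
}
```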

At this point it is worth noting two aspects of this code. The first (and very important) one is that all blocks are delimited by sequential composition (;). This is because, in a parallel language, it is important to make explicit whether blocks are executed one after another (sequentially) or at the same time (in parallel). Secondly, see how we have displayed both the process id (via the pid function call) and the value of variable p. Whilst they will probably be equal in this simple example, there is no guarantee of this - the language will allocate the iterations of a par loop to whichever processes it sees fit.

Making things more interesting

We are now going to make things a little more interesting and build upon what we have just seen. You have just read that the par loop assigns iterations to whichever processes it sees fit - we are now going to look at this in more detail.

#include <io>
#include <parallel>
#include <string>

function void main() {
   var p;
   skip ||
   par p from 0 to 3 {
      print("Hello world from pid="+itostring(pid())+" with p="+itostring(p)+"\n");
   };
};

Now compile and execute this code in the same manner as described above. You should see output similar to the following (possibly in a different order):

Hello world from pid=1 with p=0
Hello world from pid=2 with p=1
Hello world from pid=4 with p=3
Hello world from pid=3 with p=2

So what's going on here? The output is telling us that the first iteration of the par loop is running on process 1, the second on process 2, and so on. The reason for this is the use of parallel composition (||) on line 7. At this line we are in effect saying: do nothing, using the skip command, and at the same time run the par loop. In fact a par loop is a syntactic shortcut for lots of parallel compositions (in this case we could replace the par loop with four parallel compositions, although the code would look really messy!)

Absolute process selection

We have already said that the par loop makes no guarantee as to which iteration is placed upon which process. However, sometimes it is useful to know exactly what is running where. To this end we have two constructs: the proc and group statements.

Single process selection

To select a single process absolutely by its ID number you can use the proc statement. The following code illustrates this:

#include <io>

function void main() {
   proc 0 {
      print("Hello from process 0\n");
   };

   proc 1 {
      print("Hello from process 1\n");
   };
};

If you compile and execute this, it will display two lines of text - one saying hello from process 0 and the other saying hello from process 1 - although which comes first depends on the speed of the processes and will often vary even between runs!

Group process selection

Whilst the proc statement sounds jolly useful (and it is!), you can imagine that if you want multiple processes, selected by their absolute process ID, to do the same thing, then many duplicate proc statements in your code will be quite horrid (and wear out your keyboard!) Instead we supply the group statement, which allows the programmer to select multiple processes to execute the same block. Based upon the previous example code:

#include <io>
#include <parallel>
#include <string>

function void main() {
   skip ||
   group 0,1,2,3 {
      print("Hello world from pid="+itostring(pid())+"\n");
   };
};

If you compile and execute this you will get something like:

Hello world from pid=0
Hello world from pid=1
Hello world from pid=2
Hello world from pid=3

See the difference from above? Even though we have the parallel composition here, the group statement selects processes by their absolute process ID, so you can be sure that processes 0, 1, 2 and 3 are executing that block. In fact, process 0 will first run the skip statement and then the group block in this example. One last thing - notice how we had to remove all references to variable p here? Because we are no longer using the par loop, we cannot leave the declaration of this variable in the code: the language has no way to deduce the type of p and would produce an error during compilation (try it!)
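One way to picture absolute selection is as a membership test on the process ID: a process executes the group block only if its ID appears in the list. The C sketch below models that idea only; it is an assumption about how to think of the construct, not how Mesham implements it, and `in_group` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of "group 0,1,2,3 { ... }": returns true when the
 * given absolute process ID is one of the listed IDs. */
bool in_group(int pid, const int *ids, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (ids[i] == pid) {
            return true;   /* this process would execute the block */
        }
    }
    return false;          /* this process would skip the block */
}
```

Under this model, processes 0 to 3 execute the block regardless of any parallel composition around it, which matches the output shown above.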

Summary

Whilst the code we have been looking at here is very simple, in this tutorial we have looked at the four basic parallel constructs which we can use to structure our code (the par loop, parallel composition, proc and group) and discussed the differences between them. We have also looked at writing a simple Mesham code using the main function and at using standard functions by including the appropriate sub libraries.