Tutorial - Parallel Constructs

Tutorial number four - prev :: next

Introduction

In this tutorial we shall look at more advanced parallel constructs than those discussed in the Hello world tutorial. There will also be some reference made to the concepts covered in the functions and simple types tutorials.

Parallel composition

In the Hello world tutorial we briefly saw an example of using parallel composition (||) to control parallelism. Let's now explore this further with some code examples:

#include <io>
#include <string>
#include <parallel>

function void main() {
   {
      var i:=pid();
      print("Hello from PID "+itostring(i)+"\n");
   } || {
      var i:=30;
      var f:=20;
      print("Addition result is "+itostring(i+f)+"\n");
   };
};

This specifies two blocks of code, both running in parallel (on two processes). The first will display a message with the process ID in it; the other process will declare two Int variables and display the result of adding these together. This approach, of specifying code in blocks and then using parallel composition to run the blocks in parallel on different processes, is a useful one. As a further exercise, try rearranging the blocks and view the value of the process ID reported, and also add further parallel blocks (via more parallel composition) to do things and look at the results.
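
As a hint for the exercise, one possible sketch (the messages here are just illustrative and not from the original tutorial) adds a third block; with three blocks composed in parallel, each block runs on its own process:

#include <io>
#include <string>
#include <parallel>

function void main() {
   {
      print("Block one on PID "+itostring(pid())+"\n");
   } || {
      print("Block two on PID "+itostring(pid())+"\n");
   } || {
      print("Block three on PID "+itostring(pid())+"\n");
   };
};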

Unstructured parallel composition

In the previous example we structured parallel composition by using blocks; it is also possible to run individual statements in parallel using this composition, although it is important to understand the associativity and precedence of parallel composition and sequential composition when doing so.

#include <io>
#include <string>
#include <parallel>

function void main() {
   var i:=0;
   var j:=0;
   var z:=0;
   var m:=0;
   var n:=0;
   var t:=0;

   {i:=1;j:=1||z:=1;m:=1||n:=1||t:=1;};

   print(itostring(pid())+":: i: "+itostring(i)+", j: "+itostring(j)+", z: "+itostring(z)
      +", m: "+itostring(m)+", n: "+itostring(n)+", t: "+itostring(t)+"\n");
};

This is a nice little piece of code to help figure out what, for each process, is being run. You can further play with it and tweak it as required. Broadly, we are declaring all the variables to be Ints with the value zero and then executing the code in the { } code block, followed by the print statement, on all processes. Where it gets interesting is the behaviour inside the code block itself. The assignment i:=1 is executed on all processes, sequentially composed with the rest of the code block; j:=1 is executed just on process 0, whereas at the same time the value 1 is written to variables z and m on process 1. Process 2 performs the assignment n:=1 and lastly process 3 assigns 1 to variable t. From this example you can understand how parallel composition will behave when unstructured like this - as an exercise add additional code blocks (via braces) and see how that changes the behaviour by specifying explicitly what code belongs where.
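
To get started on that exercise, here is an untested sketch following the rules described above: adding braces around the first two assignments groups them into a single parallel branch, so i:=1 and j:=1 should now both run only on process 0 rather than i:=1 running on every process:

   {{i:=1;j:=1}||{z:=1;m:=1}||n:=1||t:=1;};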

The first parallel composition will bind to the statement (or code block) immediately before it and then to those after it - hence i:=1 is performed on all processes, but the sequentially composed statements after the parallel composition are performed on just one process. Incidentally, if we removed the { } braces around the unstructured parallel block, then the print statement would be performed only on process 3 - if it is not clear why, experiment and reread this section to fully understand.
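
For example, a sketch of that brace-free variant; because the right-hand side of the final parallel composition now extends to the end of the function, the print should only be executed by process 3:

#include <io>
#include <string>
#include <parallel>

function void main() {
   var i:=0;
   var j:=0;
   var z:=0;
   var m:=0;
   var n:=0;
   var t:=0;

   i:=1;j:=1||z:=1;m:=1||n:=1||t:=1;

   print("Only process 3 should reach this print, i is "+itostring(i)+"\n");
};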

Allocation inference

If we declare a variable to have a specific allocation strategy within a parallel construct then this must be compatible with the scope of that construct. For example:

function void main() {
   group 1,3 {
      var i:Int::allocated[multiple[]];
   };
};

If you compile this code, then it will work but you get the warning Commgroup type and process list inferred from multiple and parallel scope. So what does this mean? Well, we are selecting a group of processes (in this case processes 1 and 3) and declaring variable i to be an Int allocated to all processes; however, the processes not in scope (0 and 2) will never know of the existence of i and hence can never be involved with it in any way. Even worse, if we were to synchronise on i then it might cause deadlock on these other processes that have no knowledge of it. Therefore, allocating i to all processes is the wrong thing to do here. Instead, what we really want is to allocate i to the group of processes that are in parallel scope, using the commgroup type; if this is omitted the compiler is clever enough to deduce it, put that behaviour in, and warn the programmer that it has done so.

If you modify the type chain of i from Int::allocated[multiple[]] to Int::allocated[multiple[commgroup[]]] and recompile, you will see a different warning saying that it has just inferred the process list from parallel scope (and not the type, as that is already there). Now change the type chain to read Int::allocated[multiple[commgroup[1,3]]] and recompile - see that there is no warning, as we have explicitly specified the processes to allocate the variable to? It is up to you as a programmer, and your style, to decide whether you want to do this explicitly or put up with the compiler warnings.
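
For reference, the fully explicit version described above, with the process list written out in the commgroup, looks like this:

function void main() {
   group 1,3 {
      var i:Int::allocated[multiple[commgroup[1,3]]];
   };
};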

So, what happens if we try to allocate variable i to a process that is not in parallel scope? Modify the type chain of i to read Int::allocated[multiple[commgroup[1,2]]] and recompile - you should now see an error that looks like Process 2 in the commgroup is not in parallel scope. We have the same protection for the single type too:

function void main() {
   group 1,3 {
      var i:Int::allocated[single[on[0]]];
   };
};

If you try to compile this code, then you will get the error Process 0 in the single allocation is not in parallel scope, which is because you have attempted to allocate variable i to process 0, but this is not in scope so it can never be done. Whilst we have been experimenting with the group parallel construct, the same behaviour is true of all parallel structural constructs.
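
For instance, a minimal sketch (not from the original tutorial) using the proc construct instead; since process 0 is not in the scope of proc 1, this allocation should be rejected in the same way:

function void main() {
   proc 1 {
      var i:Int::allocated[single[on[0]]];
   };
};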

Nesting parallelism

Nesting parallelism is currently disallowed; whilst it could provide more flexibility for the programmer, it makes for a more complex language from the designer's and compiler writer's point of view.

function void main() {
   var p;
   par p from 0 to 3 {
      proc 0 {
         skip;
      };
   };
};

If you compile this code then it will result in the error Can not currently nest par, proc or group parallel blocks.

Parallelism in other functions

Up until this point we have placed our parallel constructs within the main function, but there is no specific reason for this.

#include <io>

function void main() {
   a();
};

function void a() {
   group 1,3 {
      print("Hello from 1 or 3\n");
   };
};

If you compile and run this code then you will see that processes 1 and 3 display the message to standard output. As an exercise, modify this code to include further functions which have their own parallel constructs in and call them from main or your own functions.
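
As a starting point for that exercise, here is one possible sketch (function b and its message are illustrative additions, not from the original tutorial) which adds a second parallel function called from main:

#include <io>

function void main() {
   a();
   b();
};

function void a() {
   group 1,3 {
      print("Hello from 1 or 3\n");
   };
};

function void b() {
   proc 0 {
      print("Hello from 0\n");
   };
};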

An important point to bear in mind is that a is now a parallel function, and there are some things to consider as a result. Firstly, all parallel constructs (par, proc and group) are blocking calls - hence all processes must see these, so to avoid deadlock all processes must call the function a. Secondly, as discussed in the previous section, remember how we disallow nested parallelism? Well, we relax this restriction here, but it is still not safe:

#include <io>

function void main() {
   var p;
   par p from 0 to 3 {
      a();
   };
};

function void a() {
   group 1,3 {
      print("Hello from 1 or 3\n");
   };
};

If you compile this code then it will work, but you will get the warning It might not be wise calling a parallel function from within a parallel block. Running the executable will result in the correct output, but changing the 3 to a 2 in the par loop will result in deadlock. Therefore it is best to avoid this technique in practice.
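
To illustrate why, here is a sketch of the problematic variant just described (with function a defined as above); process 3 never enters the par loop, so it never reaches the blocking group construct inside a, and the processes that do reach it wait forever:

function void main() {
   var p;
   par p from 0 to 2 {
      a();
   };
};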