Tutorial - Simple Types

Tutorial number two

Introduction

In this tutorial we will be looking at a simple use of types in Mesham and how we can change what our code is doing just by modifying the type. It is assumed that the reader has already worked through the Hello world tutorial and is familiar with the concepts discussed there.

A question of types

#include <io>
#include <string>

function void main() {
   var a:=78;
   print(itostring(a)+"\n");
};

In the above code snippet we have included the appropriate system headers (for printing and for the integer to string conversion at line 6), specified our program entry point via the main function and declared variable a to contain the value 78. Whilst this looks very simple (and it is), there are some important type concepts lurking behind the scenes. There are three ways of declaring a variable: via explicit typing; by specifying a value, as is the case here, so that the type is deduced via inference; or by specifying neither and postponing the typing until later on (as in the Hello world tutorial, where variable p was inferred to be an Int later on because it was used in a par statement).

In the code above, via type inference, variable a is deduced to be of type Int and, in the absence of further types, a number of other default types are associated with an integer: the stack type, which specifies that it is allocated to the stack frame of the current function; the onesided type, which determines that it uses one sided (variable sharing) communication; the allocated type, which specifies that memory is allocated; and lastly the multiple type, which specifies that the variable is allocated to all processes. So, by specifying a value, the language has deduced via inference all of this behaviour, which can be overridden by explicitly using types. Note that these defaults are not just for Ints; they apply to all element types.
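
As a minimal, untested sketch of the first two declaration styles side by side, the snippet below declares a with an explicit Int type and b with an initial value only, leaving its type to inference. It uses only constructs that appear elsewhere in this tutorial; the variable name b is purely illustrative.

#include <io>
#include <string>

function void main() {
   var a:Int;
   var b:=78;
   a:=78;
   print(itostring(a)+" "+itostring(b)+"\n");
};

If the defaults behave as described above, both variables should end up with the same type chain, so the two declarations differ only in how much the programmer has written.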

Type chains

In the previous section we saw that element types such as Ints have a default set of type behaviour associated with them. These types are combined together to form a chain. The type chain resulting from the use of an Int and these defaults is: Int::stack::onesided::allocated[multiple[]]. There are a number of points to note about this chain. The :: operator (the type chaining operator) chains these independent types together, and precedence is from right to left, so the behaviour of the types on the right overrides the behaviour of those to the left of them if there is any conflict. For example, if we were to append another form of memory allocation, the heap type which allocates memory on the heap, to the rightmost end of the chain then this would override the behaviour of the stack type, which would be to the left of it.

#include <io>
#include <string>

function void main() {
   var a:Int::stack::onesided::allocated[multiple[]];
   a:=78;
   print(itostring(a)+"\n");
};

The above code is, in terms of runtime behaviour, absolutely identical to the first code example that we have seen; we have just explicitly specified the type of variable a to be the type chain that is inferred in the first example. As you can see, in many cases being able to write code without all these explicit types simply saves typing. It is also important to note that we can associate optional information with these types. For instance, we have provided the multiple type as a parameter to the allocated type. Parameters can be anything (further type chains, values or variables known at compile time) and, in the absence of further information, it is entirely optional whether or not to provide the empty [] brackets.

Every type chain must contain at least one element type. By convention, all element types start with a capitalised first letter (such as Int, Char and Bool), whereas all other types, known as compound types, start with a lower case first letter (such as stack, multiple and allocated).
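
To make the right to left precedence rule concrete, the sketch below appends the heap type to the rightmost end of the default chain, as described earlier in this section. It is an untested illustration that assumes heap can be chained at this position exactly as the text describes; because heap sits to the right of stack, its memory allocation behaviour would be expected to take precedence.

function void main() {
   var a:Int::stack::onesided::allocated[multiple[]]::heap;
   a:=78;
};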

Let's go parallel

So the code we have seen up until this point isn't very exciting when it comes to parallelism. In the following code example we are involving two processes with shared memory communication:

#include <io>
#include <string>

function void main() {
   var a:Int::allocated[single[on[0]]];
   proc 1 {
      a:=78;
   };
   sync;
   proc 0 {
      print("Value: "+itostring(a)+"\n");
   };
};

The important change here is that we have replaced the multiple type with the single type, giving it the on type as a parameter, which in turn is given the value 0. What this does is allocate variable a to the memory of process 0 only. Note how we have also omitted the stack and onesided types - they are still added by default, as we have not specified types to control memory or the communication method - but omitting them makes the code more readable.

In the first proc block, process 1 is writing the value 78 to variable a. Because this variable is held on process 0 only and is not local to process 1, this will involve some form of shared memory communication to get that value across (as defined in the onesided communication type, which is used by default). Process 0, in the second proc block, will read out the value of variable a and display this to standard output.

A very important aspect of this code is found on line 9 and is the sync keyword. The default shared memory communication is not guaranteed to complete until the appropriate synchronisation has occurred. This acts both as a barrier and as the point at which all processes which need to will write their values of a to the target remote memory. Synchronisation is Concurrent Read Concurrent Write (CRCW), which means that between synchronisations multiple processes are allowed to read and write to the same locations any number of times, although with writing there is no guarantee which value will be used if they are different in the same step. Additionally, sync can be followed by a variable name (for example sync a), which means to synchronise on that variable alone; if you omit it, as here, then it will synchronise on all outstanding variables and their communications.

Exercise: Comment out the synchronisation line and run the code again - see how process 0 now reports the value as zero? This is because synchronisation has not occurred and the value has not been written (by default an Int is initialised to the zero value).
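
As a sketch of synchronising on a single variable, the version below names a after the sync keyword, so that only communications involving a are synchronised. It is otherwise the same program as above and has not been tested here.

#include <io>
#include <string>

function void main() {
   var a:Int::allocated[single[on[0]]];
   proc 1 {
      a:=78;
   };
   sync a;
   proc 0 {
      print("Value: "+itostring(a)+"\n");
   };
};

With only one shared variable in the program, this would be expected to behave the same as a bare sync; naming the variable matters once several variables have outstanding communications.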

Further parallelism

We have very slightly modified the code below:

#include <io>
#include <string>

var master:=1;
var slave:=0;

function void main() {
   var a:Int::allocated[single[on[master]]];
   proc slave {
      a:=78;
   };
   sync;
   proc master {
      print("Value: "+itostring(a)+"\n");
   };
};

You can see that here we have added two variables, master and slave, which control where the variable is allocated and which process writes the value. Try modifying these values, although be warned that changing them to large values will cause the creation of many processes that do nothing, as the proc construct will create the preceding processes to honour the process ID; for instance, if you specify the master to be 90, then processes 0 to 90 will be created to ensure that the process with ID 90 executes that specific block. The limitation here is that the values of these variables must be known at compile time, so it is fine to specify them in the code like this but they could not, for example, be the result of some user input or a command line argument. Also note how we have declared these variables to have global program scope by declaring them outside of the function. Of course we could just as easily have placed them inside the main function, but this was to illustrate that declaring variables is allowed in global scope outside of a function body.
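
For comparison, a sketch of the same program with master and slave declared inside the main function rather than in global scope is given below. It is untested, and assumes only what the paragraph above states: that the declarations may be placed in either scope, provided their values are still known at compile time.

#include <io>
#include <string>

function void main() {
   var master:=1;
   var slave:=0;
   var a:Int::allocated[single[on[master]]];
   proc slave {
      a:=78;
   };
   sync;
   proc master {
      print("Value: "+itostring(a)+"\n");
   };
};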

Changing the type

As the Mesham code runs we can change the type of a variable by modifying the chain; this is illustrated in the following code:

function void main() {
   var a:Int;
   a:=23;
   a:a::const;
   a:=3;
};

Try to compile this - see an error at line 5? Don't worry, that was entirely expected. We are typing variable a to be an Int (with all the default types that go with it) and performing an assignment at line 3, which goes ahead fine. Then, at line 4, we are modifying the type of a via the set type operator : to be the current type of a chained with the const type, which forces the variable to be read only. Hence the assignment at line 5 fails, because the type of variable a has the const type in the chain. By removing either this assignment or the type modification at line 4 the code will compile fine.
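
Note that const only prevents writing. The untested sketch below replaces the failing assignment with a print of a; on the assumption that reading a read only variable is permitted, as the description of const suggests, it would be expected to compile.

#include <io>
#include <string>

function void main() {
   var a:Int;
   a:=23;
   a:a::const;
   print(itostring(a)+"\n");
};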

Modifying types in this form can be very powerful but there are some points to bear in mind. Firstly, it is not possible to modify the allocated type or its contents, as we are changing the behaviour of a variable but not if and where it is allocated in memory; doing so will result in an error. Secondly, modifying a type binds this modification to the local scope, and once we leave this scope the type reverts to what it was before.

function void main() {
   var a:Int;
   a:=23;
   a::const:=3;
};

It is also possible to modify the type chain of a variable just for a specific assignment or expression. The code above will also fail to compile, because the programmer has specified that, just for the assignment at line 4, the const type is appended to the end of the type chain of variable a. If you remove this type modification then the code is perfectly legal and will compile and execute fine.