Tutorial - Functions

From Mesham
Revision as of 15:44, 15 April 2019 by Polas (talk | contribs) (6 revisions imported)

Tutorial number three

Introduction

In this tutorial we will look at the use of functions in Mesham, both writing our own functions and calling others. Functional abstraction is a useful feature of many languages, as it allows programmers to break their code into smaller, more manageable pieces. We shall also take a look at how to provide optional command line arguments to Mesham code.

My first function

#include <io>
#include <string>

function Int myAddFunction(var a:Int, var b:Int) {
   return a+b;
};

function void main() {
   var a:=10;
   var c:=myAddFunction(a,20);
   print(itostring(c)+"\n");
};

The above code declares two functions: myAddFunction, which takes two Ints and returns an Int (the sum of the two numbers), and main, which is the program entry point. In main you can see that we call myAddFunction with a mixture of the variable a and the constant value 20. The result of this call is assigned to variable c, which is then displayed to standard output.

There are a number of points to note about this. First, notice that each function body is terminated with the sequential composition (;) token. This is because all blocks in Mesham must be terminated with some form of composition and functions are no exception, although it is currently meaningless to terminate a function with parallel composition. Secondly, move myAddFunction so that it appears below the main function and recompile - see that it still works? This is because functions in Mesham can appear in any order, and it is up to the programmer to decide which order makes their code most readable. As an exercise, notice that we don't really need variable c at all - remove it and, in the print function call, replace the reference to c with the call to our own function itself.
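As a rough sketch of where both suggestions lead, the version below places myAddFunction after main and passes the function call straight into the print call. It uses only constructs already seen in the first example, and it assumes that a function call can be nested inside itostring like any other Int expression; if so, it should still display 30.

#include <io>
#include <string>

function void main() {
   var a:=10;
   print(itostring(myAddFunction(a,20))+"\n");
};

function Int myAddFunction(var a:Int, var b:Int) {
   return a+b;
};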

Function arguments

By default all element types and records are passed by value, whereas arrays and reference records are passed by reference. This depends on the manner in which these data types are allocated, the former using the stack type and the latter the heap type. We can control whether a function's arguments and return value are passed by value or by reference by specifying the stack (value), static (value) or heap (reference) type in the type chain.

#include <io>
#include <string>

function void main() {
   var a:=10;
   myChangeFunction(a);
   print(itostring(a)+"\n");
};

function void myChangeFunction(var mydata:Int) {
   mydata:=76;
};

If you compile and execute the code above, you will see the output 10. This is because, by default, an Int is passed by value: the value of a is copied into myChangeFunction, which sets mydata equal to it. Because mydata occupies entirely different memory from a, modifying mydata has no effect upon a.

#include <io>
#include <string>

function void main() {
   var a:=10;
   myChangeFunction(a);
   print(itostring(a)+"\n");
};

function void myChangeFunction(var mydata:Int::heap) {
   mydata:=76;
};

This code snippet is very similar to the previous one, but we have added the heap type to the type chain of mydata - if you compile and execute this you will now see the output 76. This is because, by using the heap type, we have changed to pass by reference, which means that mydata and a share the same memory and hence a change to one will modify the other. As far as function arguments go, it is fine to allocate a variable's memory by one means and pass it to a function which expects memory in a different form - such as above, where a is (by default) allocated to stack memory but mydata is on heap memory. In such cases Mesham handles the necessary transformations.
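For comparison, here is a minimal sketch that writes the default behaviour out explicitly. It assumes that the stack type can be placed in an argument's type chain in the same way as heap, as the start of this section suggests; under that assumption mydata is passed by value again and the program should display 10.

#include <io>
#include <string>

function void main() {
   var a:=10;
   myChangeFunction(a);
   print(itostring(a)+"\n");
};

function void myChangeFunction(var mydata:Int::stack) {
   mydata:=76;
};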

The return type

function Int::heap myNewFunction() {
   var a:Int::heap;
   a:=23;
   return a;
};

The code snippet above returns an Int by reference when the function is called. Internally, the function creates variable a, allocates it to heap memory, sets its value and returns it. However, an important distinction between function arguments and function return types is that the memory allocation of what we are returning must match the declared return type. For example, change the type chain in the declaration from Int::heap to Int::stack and recompile - see that there is an error? Thinking about this logically, it is the only way in which this can work - if we allocate to the stack then the memory is on the current function's stack frame, which is destroyed once that function returns; if we were to return a reference to an item on it, then that item would no longer exist and bad things would happen! By ensuring that the memory allocations match, we have allocated a to the heap, which exists outside of the function calls and will be garbage collected when appropriate.
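To try this out, the function needs a caller. The sketch below pairs myNewFunction with a main function in the style of the earlier examples. The variable name b is purely illustrative, and the sketch assumes that the returned value can be assigned to a variable with an inferred type and passed to itostring like any other Int; if so, it should display 23.

#include <io>
#include <string>

function Int::heap myNewFunction() {
   var a:Int::heap;
   a:=23;
   return a;
};

function void main() {
   var b:=myNewFunction();
   print(itostring(b)+"\n");
};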

Leaving a function

Regardless of whether we are returning data from a function or not, we can use the return statement on its own to force leaving that function.

function void myTestFunction(var b:Int) {
   if (b==2) return;
};

In the above code if variable b has a value of 2 then we will leave the function early. Note that we have not followed the conditional by an explicit block - this is allowed (as in many languages) for a single statement.

As an exercise, add some value after the return statement so that it reads something like return 23; - now attempt to recompile and see that you get an error, because in this case we are attempting to return a value when the function's definition states that it does no such thing.
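To see an early return take effect at run time, the following sketch extends myTestFunction with a print call after the conditional and calls it twice from main. The extra print call and the main function are additions for illustration only. The call with 1 should display 1, whereas the call with 2 should leave the function before reaching the print call and display nothing.

#include <io>
#include <string>

function void myTestFunction(var b:Int) {
   if (b==2) return;
   print(itostring(b)+"\n");
};

function void main() {
   myTestFunction(1);
   myTestFunction(2);
};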

Command line arguments

The main function also supports the reading of command line arguments. By definition you can provide the main function with either no function arguments (as we have seen up until this point) or alternatively two arguments, the first an Int and the second an array of Strings.

#include <io>
#include <string>

function void main(var argc:Int, var argv:array[String]) {
   var i;
   for i from 0 to argc - 1 {
      print(itostring(i)+": "+argv[i]+"\n");
   };
};

Compile and run the above code. With no arguments you will just see the name of the program; if you now supply command line arguments (separated by spaces) then these will also be displayed. There are a couple of general points to note about the code above. Firstly, the variable names argc and argv for the command line arguments are the generally accepted names to use - although you can call these variables whatever you want if you are so inclined.

Secondly, notice how we only tell the array type that it is a collection of Strings and do not give any information about its dimensions. This is allowed in a function argument's type as we don't always know the size, but it limits us to one dimension and stops any error checking from happening on the index bounds used to access elements. Lastly, see how we are looping from 0 to argc - 1. The for loop is inclusive of both bounds, so looping from 0 to argc would perform one iteration too many (even if argc were zero, one iteration would still occur), which is not what we want here.