Distributed Functional Programming with F# MPI Tools for .NET

Introduction

For many years, parallel computing has been an important research area in high performance computing. Supercomputers long dominated the field. However, as fast computers and fast networks became cheaper to obtain, cluster computing came to be considered a good alternative. The High Performance Computing market grew rapidly, mainly because of the growth in clusters. According to one study, clusters represented 50% of High Performance Computing system revenue at the end of 2005.

The idea of cluster computing is to connect many machines over a high-speed network so that a cluster of computers runs the same program. With the recent adoption of multi-core CPUs in desktop systems, this has become even more important. MPI makes it even easier to build supercomputers out of powerful computers, high-speed networks and powerful libraries.

Message Passing Interface (MPI) is the standard for message passing in a distributed computing environment, and it is invaluable to researchers.

MPICH is an open source, portable implementation of MPI for developing distributed-memory applications.

The goal of MPI Tools is to make it easy to write programs that run on a cluster of machines, and to ease the transition and portability of existing cluster programs. Using MPI Tools, it is possible to create distributed functional applications with F#. Although it is developed primarily for the .NET Framework, it should be able to run on any CLI implementation.

Implementation

The first step was to make MPICH available to F#. A wrapper library was implemented for MPICH, exposing the most commonly used MPICH functions to F# with effective use of types: MPI_Init, MPI_Comm_size, MPI_Comm_rank, MPI_Finalize, MPI_Send, MPI_Recv, MPI_Abort, MPI_Barrier, MPI_Bcast, MPI_Gather, MPI_Scatter and MPI_Reduce. In practice, you can write distributed programs with just the first six of those functions, as I will show in the samples. When using the library you don’t have to worry about types and data sizes as you usually do in C programming. The only thing that matters is the order of communication, just as in socket programming.

Because MPICH is an unmanaged library, it is important to make the data types compatible using the interoperability libraries of the .NET Framework. All of the exposed data types and functions are defined in the “mpi.h” file of the MPICH distribution. If you want to use a different MPI implementation, you need to change those function definitions accordingly, based on its documentation.

[<DllImport("mpich2.dll")>]  // assumed library name; use the DLL shipped with your MPICH installation
extern int MPI_Send(void* buf, int count, int datatype,
                    int dest, int tag, int comm)
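MPI_Send takes a raw buffer pointer, so a managed array has to stay at a fixed address for the duration of the native call. The following is a minimal sketch of one way to do that with the .NET interop library. It is an assumption about the technique, not the MPI Tools source: withPinned is a hypothetical helper name, and the native call is replaced by Marshal reads so that the sketch runs without MPICH.

```fsharp
open System
open System.Runtime.InteropServices

// Hypothetical helper: pin a managed array so the GC cannot move it,
// hand its address to a callback, and always release the pin afterwards.
// Pinning only works for arrays of blittable element types (int, byte, double, ...).
let withPinned (data: 'a[]) (f: nativeint -> 'r) : 'r =
    let handle = GCHandle.Alloc(data, GCHandleType.Pinned)
    try
        f (handle.AddrOfPinnedObject())
    finally
        handle.Free()

// Stand-in for a native call: read the buffer back through the raw pointer.
let firstAndLast =
    withPinned [| 10; 20; 30 |] (fun ptr ->
        let first = Marshal.ReadInt32(ptr, 0)
        let last = Marshal.ReadInt32(ptr, 2 * sizeof<int>)
        first, last)

printfn "%A" firstAndLast
```

In a real binding the callback would pass the pointer as the buf argument of the extern function, together with the element count.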

Once the value types such as int, char, byte, double and float were implemented, which are much the same as in the C implementation, the next step was to make the reference types of the virtual machine available. Unfortunately, not all reference types can be sent over the wire, because of the state or impurity of the type. A type has to be serializable in order to be sent or received. To make that possible, values are binary-serialized and passed to MPI as a byte array. Implementing the reference types also made it possible to pass functions and lambda functions through the channel.
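To illustrate the idea of turning a reference type into a byte array and back, here is a small round-trip sketch. It is not the MPI Tools implementation: the library relies on .NET binary serialization, whereas this sketch encodes a record by hand with BinaryWriter and BinaryReader as a stand-in, so it cannot carry functions or lambdas. The Job type and the toBytes and fromBytes names are hypothetical.

```fsharp
open System.IO

// Hypothetical message type used only for this illustration.
type Job = { Id: int; Name: string; Weights: float[] }

// Encode the record field by field into a byte array.
let toBytes (job: Job) : byte[] =
    use ms = new MemoryStream()
    use w = new BinaryWriter(ms)
    w.Write(job.Id)
    w.Write(job.Name)
    w.Write(job.Weights.Length)          // length prefix for the array
    for x in job.Weights do
        w.Write(x)
    w.Flush()
    ms.ToArray()

// Decode in exactly the same field order used by toBytes.
let fromBytes (bytes: byte[]) : Job =
    use ms = new MemoryStream(bytes)
    use r = new BinaryReader(ms)
    let id = r.ReadInt32()
    let name = r.ReadString()
    let n = r.ReadInt32()
    let weights = Array.init n (fun _ -> r.ReadDouble())
    { Id = id; Name = name; Weights = weights }

let original = { Id = 7; Name = "matrix-block"; Weights = [| 0.5; 1.5 |] }
let copy = fromBytes (toBytes original)
printfn "round trip preserved the value: %b" (copy = original)
```

The byte array produced on the sending side is what would be handed to MPI with a byte datatype and the array length as the count; the receiver decodes it after the matching receive.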

For MPI development, the key factor is the data types. The parties have to agree on the data types, and the sizes of those types have to be fixed in order to communicate. The library handles this for you using the capabilities of the F# type system, so in your programs only the order of communication really matters. To give a look at the internals of MPI Tools, here is an implementation of the standard MPICH type definitions and a type converter for them (shortened for simplicity).

type MPI_Datatype =
    | MPI_CHAR        = 0x4c000101
    | MPI_SIGNED_CHAR = 0x4c000118
    // ... remaining datatypes (including MPI_BYTE) omitted

// Map the runtime type of a value to the matching MPICH datatype code.
let private TypeConvert (t) =
    let res =
        match (box t) with
        | :? byte -> MPI_Datatype.MPI_BYTE
        | :? char -> MPI_Datatype.MPI_CHAR
        | _ -> failwith "not implemented data type"
    Enum.to_int res

The complicated, many-parameter function calls of the unmanaged MPICH library become powerful functions with only a few arguments in the .NET library. For instance, the six-argument MPI_Send function defined above becomes a three-argument polymorphic function. In fact send, like most of the communication functions, comes in different flavours for different types. The version below is used for single values; there are two more flavours, one for arrays and one for matrix types.

let send (data, destination, tag)
    'a * int * int -> unit

let sendArray (data : 'a array, destination, tag)
    'a array * int * int -> unit

let sendMatrix (data : matrix, destination, tag)
    matrix * int * int -> unit
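The step from six arguments to three can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: nativeSend is a hypothetical stand-in for the extern MPI_Send that only records its arguments, commWorld is a placeholder and not the real communicator handle, and mapping sbyte to MPI_SIGNED_CHAR is an assumption. The two datatype codes are the ones quoted in the listing earlier in this article.

```fsharp
open System.Collections.Generic

// Hypothetical stand-in for the six-argument native MPI_Send: it only records
// what it was called with, so the wrapper can be inspected without MPICH.
let calls = List<int * int * int * int * int>()

let nativeSend (buf: obj, count: int, datatype: int, dest: int, tag: int, comm: int) : int =
    calls.Add((count, datatype, dest, tag, comm))
    0   // MPI_SUCCESS

// Derive the datatype code from the runtime type of the value.
let datatypeOf (value: 'a) : int =
    match box value with
    | :? char  -> 0x4c000101   // MPI_CHAR
    | :? sbyte -> 0x4c000118   // MPI_SIGNED_CHAR (assumed mapping)
    | _ -> failwith "data type not implemented"

let commWorld = 0   // placeholder, NOT the real MPI_COMM_WORLD handle

// Three-argument polymorphic send: count, datatype and communicator
// are filled in by the wrapper instead of by the caller.
let send (data: 'a, destination: int, tag: int) : unit =
    let rc = nativeSend (box data, 1, datatypeOf data, destination, tag, commWorld)
    if rc <> 0 then failwithf "send failed with code %d" rc

send ('x', 1, 0)
send (-5y, 2, 7)
printfn "%d native calls recorded" calls.Count
```

The caller never mentions a count, a datatype or a communicator; an unsupported type fails inside datatypeOf before anything reaches the native layer.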

Similarly, the same pattern applies to the receive functions. This time, however, the return type has to be specified as a generic argument of the function.

let receive<'a> (source, tag)
    int * int -> 'a

let receiveArray<'a> (source, tag)
    int * int -> 'a array

let receiveMatrix (source, tag)
    int * int -> matrix
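Why the receiver must name the type can be seen in a small single-process mock. This is only a sketch with the same call shapes as the signatures above, not MPI Tools itself: the mailboxes dictionary stands in for the wire, messages are matched by tag alone, and the rank arguments are accepted but ignored.

```fsharp
open System.Collections.Generic

// In-process stand-in for the wire: one queue of boxed values per tag.
let mailboxes = Dictionary<int, Queue<obj>>()

// Mock send: the destination rank is ignored in this single-process sketch.
let send (data: 'a, destination: int, tag: int) : unit =
    if not (mailboxes.ContainsKey(tag)) then
        mailboxes.[tag] <- Queue<obj>()
    mailboxes.[tag].Enqueue(box data)

// Mock receive: nothing in the arguments determines the result type,
// so the caller supplies it as a generic argument.
let receive<'a> (source: int, tag: int) : 'a =
    unbox<'a> (mailboxes.[tag].Dequeue())

send (41, 1, 0)
send ([| 1.5; 2.5 |], 1, 9)

let n = receive<int>(0, 0)
let xs = receive<float[]>(0, 9)
printfn "%d %A" (n + 1) xs
```

If the type named by the receiver does not match what was sent, the unbox fails at run time, which is why both sides have to agree on the order and types of the messages.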

The other MPI functions are implemented in a similar manner. The project can also be read as a tutorial: the library makes effective use of active patterns, discriminated unions, interoperability and other functional constructs.

Usage

First of all, MPICH needs to be installed. The library is then used like any other .NET library in your programs. Execution, however, differs from the usual: programs have to be started with the MPI launcher, “mpiexec”. See the MPICH documentation for how to configure a cluster. To run the program as n processes, pass the “-n” switch with the process count on the command line, followed by the name of the compiled program.

mpiexec -n 2 test.exe

Here is a very simple ping-pong application using MPI Tools. You can find more samples in the MPI Tools source code.

#light
#I @"..\MPITools.Bindings\"
#r @"MPITools.Bindings.dll"
open MPITools

MPI.initialize()

let procSize = MPI.size()
let curProcess = MPI.rank()

let pingpong() =
    if curProcess = 0 then
        // Process 0 serves the ball and waits for it to come back.
        let i = 0
        MPI.send(i, 1, 0)
        let b = MPI.receive<int>(1, 0)
        ()
    elif curProcess = 1 then
        // Process 1 receives the ball and returns it incremented.
        let b = MPI.receive<int>(0, 0)
        MPI.send(b + 1, 0, 0)

pingpong()
MPI.finalize()

Conclusion

You can download MPI Tools from CodePlex. Using MPI Tools, distributed programs are short, expressive and well typed, thanks to the F# type system.

MPI Tools is built with the F# 1.9.3.7 compiler for the .NET Framework 2.0. However, it may well work with any CLI implementation. In the future, more MPI functions will be implemented, including some helper functions that hide the imperative programming style and the side effects.

I hope it helps you solve your compute-intensive problems effectively. Please feel free to ask questions or to contribute to the project.

Have fun!