
Today’s parallelism

Industry standards

OpenMP

MPI

Design goals for Sklml

The traditional approaches to parallelism exhibit major drawbacks

The Sklml answers

Sklml’s parallelism

Overview

What Sklml is

As a result, Sklml

What Sklml is not

On the other hand,

Programming with skeletons

Sklml skeletons

What is a skeleton

A skeleton is an OCaml value with type ('a, 'b) skel
(its input is of type 'a and its output is of type 'b).

A skeleton is a function acting on streams (potentially infinite sequences of data).
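To make the "function on streams" reading concrete, here is a minimal sketch in plain OCaml. It is only a sequential model, not the Sklml type definition: the type `skel` below, and the helpers `of_fun`, `naturals` and `take`, are assumptions introduced for illustration.

```ocaml
(* Hypothetical sequential model of a skeleton: a transformer of lazy
   streams. This is NOT how Sklml defines its abstract skel type. *)
type ('a, 'b) skel = 'a Seq.t -> 'b Seq.t

(* Lift an ordinary function to a stream transformer. *)
let of_fun (f : 'a -> 'b) : ('a, 'b) skel = Seq.map f

(* A potentially infinite input stream: 0, 1, 2, ... *)
let naturals : int Seq.t = Seq.unfold (fun n -> Some (n, n + 1)) 0

(* Force only the first n elements of a stream. *)
let rec take n (s : 'a Seq.t) : 'a list =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

let square : (int, int) skel = of_fun (fun x -> x * x)

(* Only five elements of the infinite stream are ever computed. *)
let first_squares = take 5 (square naturals)
```

Because the stream is lazy, the skeleton can be applied to an unbounded input and still be observed on a finite prefix.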

The Sklml library provides skeletal combinators which might either

The farm skeleton combinator

The farm skeleton combinator applies one treatment in parallel to a flow of data.

val farm : ('a, 'b) skel * int -> ('a, 'b) skel;;


Figure: farm (F, 2) skeleton graph
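Read sequentially, a farm should compute the same thing as its worker. The sketch below models that reading in plain OCaml, representing a skeleton by the function it applies to one stream element. That representation and the name `slow_square` are assumptions for illustration, not Sklml code.

```ocaml
(* Sequential model of farm: the worker count only affects how the work
   would be scheduled, not what is computed, so it is ignored here. *)
let farm ((f : 'a -> 'b), (_nworkers : int)) : 'a -> 'b = f

(* Stands in for an expensive treatment. *)
let slow_square x = x * x

(* Applying the farmed worker to a flow of four inputs. *)
let results = List.map (farm (slow_square, 2)) [1; 2; 3; 4]
```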

The pipeline skeleton combinator

The pipeline skeleton combinator models the parallel composition of functions.

val ( ||| ) :
  ('a, 'b) skel -> ('b, 'c) skel -> ('a, 'c) skel;;


Figure: G ||| F skeleton graph
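Following the type of `|||`, the left operand runs first and feeds the right one. A minimal sequential model in plain OCaml, again treating a skeleton as an ordinary function (an assumption of this sketch, not the Sklml implementation):

```ocaml
(* Sequential model of the pipeline: left-to-right function composition. *)
let ( ||| ) (f : 'a -> 'b) (g : 'b -> 'c) : 'a -> 'c = fun x -> g (f x)

(* Hypothetical stages used only for this illustration. *)
let inc x = x + 1
let double x = x * 2

let inc_then_double = inc ||| double
let result = inc_then_double 3
```

In a parallel run the two stages could work on different stream elements at the same time; the value computed for each element is unchanged.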

The loop skeleton combinator

The loop skeleton combinator is a control combinator: it repeatedly applies a skeleton to a data item until the resulting value no longer satisfies a given predicate.

val loop :
  ('a, bool) skel * ('a, 'a) skel -> ('a, 'a) skel;;


Figure: loop (P, F) skeleton graph
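The description above suggests a do-while behaviour: the body runs, then the predicate is tested on the result. The sketch below models that reading in plain OCaml with skeletons represented as functions. Both the representation and the do-while interpretation are assumptions drawn from the wording here, not from the Sklml sources.

```ocaml
(* Sequential model of loop (p, f): apply f, test p on the result, and
   repeat while p holds. The body always runs at least once. *)
let loop ((p : 'a -> bool), (f : 'a -> 'a)) : 'a -> 'a =
  let rec go x =
    let y = f x in
    if p y then go y else y
  in
  go

(* Keep doubling while the result stays below 100. *)
let grow = loop ((fun x -> x < 100), (fun x -> x * 2))

let from_3 = grow 3      (* 6, 12, 24, 48, 96, 192 *)
let from_500 = grow 500  (* the body still runs once *)
```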

Other skeleton combinators

The &&& skeleton combinator models the parallel application of two functions.

val ( &&& ) :
  ('a, 'b) skel -> ('c, 'd) skel ->
  ('a * 'c, 'b * 'd) skel;;

The +++ skeleton combinator models the parallel application of two functions to the elements of the direct sum of two sets.

val ( +++ ) :
  ('a, 'c) skel -> ('b, 'c) skel ->
  (('a, 'b) sum, 'c) skel;;

where sum is the classical direct sum of sets, defined as

type ('a, 'b) sum = Inl of 'a | Inr of 'b;;
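The two signatures can be read off directly as operations on pairs and on tagged values. The following plain-OCaml sketch models them sequentially, with skeletons represented as functions (an assumption for illustration only):

```ocaml
type ('a, 'b) sum = Inl of 'a | Inr of 'b

(* Product: each component of the pair goes through its own function. *)
let ( &&& ) (f : 'a -> 'b) (g : 'c -> 'd) : 'a * 'c -> 'b * 'd =
  fun (x, y) -> (f x, g y)

(* Sum: the tag selects which function handles the value. *)
let ( +++ ) (f : 'a -> 'c) (g : 'b -> 'c) : ('a, 'b) sum -> 'c =
  function
  | Inl x -> f x
  | Inr y -> g y

let pair_result = (string_of_int &&& not) (3, true)

let size = (fun n -> n + 1) +++ String.length
let left_result = size (Inl 41)
let right_result = size (Inr "abc")
```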

Other skeleton combinators

The farm_vector skeleton combinator models the parallel application of a function to the items of a vector.

val farm_vector :
  ('a, 'b) skel * int -> ('a array, 'b array) skel;;

The rails skeleton combinator models the parallel application of a vector of n functions to the n items of an input vector.

val rails :
  (('a, 'b) skel) array -> ('a array, 'b array) skel;;
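Sequentially, both combinators amount to array maps. The sketch below is a plain-OCaml model with skeletons represented as functions. The size check in `rails` is an assumption of this sketch, since the text does not say what happens when the two vectors differ in length.

```ocaml
(* farm_vector: one function applied to every item; the worker count is
   irrelevant to the sequential result. *)
let farm_vector ((f : 'a -> 'b), (_nworkers : int)) : 'a array -> 'b array =
  Array.map f

(* rails: the i-th function is applied to the i-th item. *)
let rails (fs : ('a -> 'b) array) : 'a array -> 'b array =
  fun xs ->
    if Array.length fs <> Array.length xs then
      invalid_arg "rails: size mismatch";
    Array.mapi (fun i x -> fs.(i) x) xs

let squares = farm_vector ((fun x -> x * x), 4) [| 1; 2; 3 |]
let mixed = rails [| (fun x -> x + 1); (fun x -> x * 10) |] [| 1; 2 |]
```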

Examples

A simple example

Introducing the example

Problem: Find the first element that does not satisfy a given property P.

We suppose that P is expensive and must be computed in parallel.

We also have two functions:

This problem is borrowed from the program PrimeGen that generates primes satisfying strong cryptographic properties.

The actual Sklml code

In sequential C, this boils down to a simple do-while loop:

/* Advance to the next candidate until the test fails. */
do {
    elm = next_elm(elm);
} while (test_elm(elm));

In Sklml, the program uses the loop skeleton, with a predicate described as a parallel pipeline:

let find_skl nw =
  loop ( farm_vector (test_elm, nw) ||| fold_or,
         next_elms ) in
  ...

The Sklml compiler can compile this program for both sequential and parallel execution.

Domain Decomposition problems using Sklml (1)

Sklml was developed to cope with scientific computing problems, in particular domain decomposition problems.

Domain decomposition algorithm: a computation needs to be performed on a grid (the domain) split into small subdomains.

Domain decomposition algorithms perform a sequence of rounds, each made of two steps:

  1. each processor runs a step of a numerical scheme on its subdomain;
  2. border information is exchanged between processors.
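The two steps of a round can be sketched on a toy one-dimensional problem. The following plain OCaml is a sequential illustration only: the record type `sub`, the averaging scheme in `step`, and the fixed outer boundary are all assumptions chosen for the example, and every subdomain is assumed to be non-empty.

```ocaml
(* One subdomain: its cells plus the border values received from the
   left and right neighbours. *)
type sub = { cells : float array; left : float; right : float }

let last (a : float array) : float = a.(Array.length a - 1)

(* Step 1: one averaging sweep on a single subdomain, reading the
   neighbours' values through the stored borders. *)
let step (s : sub) : sub =
  let n = Array.length s.cells in
  let get i =
    if i < 0 then s.left else if i >= n then s.right else s.cells.(i)
  in
  { s with cells = Array.init n (fun i -> (get (i - 1) +. get (i + 1)) /. 2.0) }

(* Step 2: each subdomain receives the edge cells of its neighbours.
   The outermost borders are kept fixed. *)
let exchange (subs : sub array) : sub array =
  let k = Array.length subs in
  Array.mapi
    (fun j s ->
      let left = if j = 0 then s.left else last subs.(j - 1).cells in
      let right = if j = k - 1 then s.right else subs.(j + 1).cells.(0) in
      { s with left; right })
    subs

(* A round is a compute step on every subdomain, then a border exchange. *)
let round (subs : sub array) : sub array = exchange (Array.map step subs)

let rec iterate n subs = if n = 0 then subs else iterate (n - 1) (round subs)

(* A linear profile 0, 1, ..., 5 split in two subdomains is a fixed point. *)
let example =
  [| { cells = [| 1.; 2. |]; left = 0.; right = 3. };
     { cells = [| 3.; 4. |]; left = 2.; right = 5. } |]

let after = iterate 3 example
```

In a parallel setting the `Array.map step` call is the part that would run on separate processors, while `exchange` corresponds to the communication between them.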

Domain Decomposition problems using Sklml (2)


Figure: Computation using a domain decomposition algorithm

Domain Decomposition problems using Sklml (3)

Sklml provides a library of derived operators written as compositions of the basic skeletons.

The make_domain skeleton is specific to domain decomposition algorithms.

Given a vector of skeleton workers, the connectivity of the subdomains, and a stopping criterion, the make_domain skeleton combinator creates a skeleton implementing the appropriate domain decomposition algorithm.

type ('a, 'b) worker_spec =
  ('a border list, 'a * 'b) skel * int list

val make_domain :
  (('a, 'b) worker_spec) array ->
  ('b array, bool) skel ->
  ('a array, ('a * 'b) array) skel

Inside Sklml

The Sklml distribution

Sklml is a set of 4 components written both in OCaml and Sklml:




Sklml is free software available at http://sklml.inria.fr/.

Sklml’s key feature (1)

Fact: Skeletal combinators have simple sequential semantics. As a consequence, two compilation modes are offered: a sequential interpretation of the skeletal combinators and a parallel one.

The two semantics in practice

Compile either in parallel mode:

sklmlc -mode par code.ml

or in sequential mode:

sklmlc -mode seq code.ml

Sklml’s key feature (2)

The Sklml system guarantees that:

Hence, the methodology:

  1. develop and debug using the sequential semantics;
  2. start the heavy parallel computation after changing a flag in the makefile!

Sklml and OCaml 3.12

Due to its high abstraction level, Sklml needs advanced features of the OCaml language:

Interacting with Sklml

Sklml and the other languages

Sequential parts of Sklml programs can be written:

Existing code can be parallelized with Sklml!
(In particular, closed or complex code from third parties.)

Future directions

State of the art

Sklml is robust and usable but can be improved:

End

That’s all folks!

Code appendix

Implementing simple helper skeletons

let projl = skl () -> fun (x, _) -> x;;
let projr = skl () -> fun (_, x) -> x;;

let injl = skl () -> fun x -> Inl x;;
let injr = skl () -> fun x -> Inr x;;

Implementing an if_then_else skeleton

let dup = skl () -> fun x -> (x, x);;
let to_sum = skl () ->
  fun (x, b) -> if b then Inl x else Inr x
;;

let if_then_else (cond_skl, then_skl, else_skl) =
  dup () ||| (id () *** cond_skl) |||
    to_sum () ||| (then_skl +++ else_skl)
;;
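To check how the pieces of `if_then_else` fit together, the same composition can be replayed in plain OCaml with skeletons modelled as ordinary functions. The operator definitions below are sequential stand-ins written for this sketch, not the Sklml combinators, and `abs_skl` is a hypothetical usage example.

```ocaml
type ('a, 'b) sum = Inl of 'a | Inr of 'b

(* Sequential stand-ins for the combinators used above. *)
let ( ||| ) f g = fun x -> g (f x)
let ( *** ) f g = fun (x, y) -> (f x, g y)
let ( +++ ) f g = function Inl x -> f x | Inr y -> g y

let id x = x
let dup x = (x, x)
let to_sum (x, b) = if b then Inl x else Inr x

(* Duplicate the input, evaluate the condition on one copy, tag the other
   copy with the outcome, then dispatch on the tag. *)
let if_then_else (cond, then_f, else_f) =
  dup ||| (id *** cond) ||| to_sum ||| (then_f +++ else_f)

(* Absolute value written with the derived conditional. *)
let abs_skl = if_then_else ((fun x -> x >= 0), (fun x -> x), (fun x -> -x))
```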

Factorial in pure Sklml

let is_gt = skl i -> ( < ) i;;
let con =   skl x -> fun _ -> x;;
let minus = skl i -> fun x -> x - i;;
let mult =  skl () -> fun (a, b) -> a * b;;

let fact =
  dup () ||| (id () *** con 1) |||
  loop
    ( projl () ||| is_gt 1 
    , dup () |||
      ( (projl () ||| minus 1) ***
        mult ()
      )
    ) |||
  projr ()
;;
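The data flow of `fact` is easier to follow when replayed sequentially. The plain-OCaml sketch below models each skeleton as a function and reads `loop` as a do-while (body first, then test). Both choices are assumptions of this sketch rather than the Sklml implementation.

```ocaml
(* Sequential stand-ins for the combinators. *)
let ( ||| ) f g = fun x -> g (f x)
let ( *** ) f g = fun (x, y) -> (f x, g y)
let loop (p, f) =
  let rec go x = let y = f x in if p y then go y else y in
  go

(* Helper skeletons, modelled as plain functions. *)
let id x = x
let dup x = (x, x)
let projl (x, _) = x
let projr (_, x) = x
let is_gt i = ( < ) i            (* is_gt i x  holds when  x > i *)
let con x = fun _ -> x
let minus i = fun x -> x - i
let mult (a, b) = a * b

(* The state is a pair (counter, accumulator), started at (n, 1).
   Each iteration maps (k, acc) to (k - 1, k * acc) and continues
   while the new counter is greater than 1. *)
let fact =
  dup ||| (id *** con 1) |||
  loop
    ( projl ||| is_gt 1
    , dup ||| ((projl ||| minus 1) *** mult) ) |||
  projr
```

For an input of 5 the state goes through (5, 1), (4, 5), (3, 20), (2, 60), (1, 120), and the right projection returns 120. Under this do-while reading the body runs once even for an input of 0, so the model returns 0 rather than 1 in that case.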
