Conversion of 'functional' to 'imperative' algorithms: is it possible?

Does anybody know any work done on conversion of 'functional' to 'imperative' algorithms?

I am asking this because, while doing some mental exercises with algorithms to help me understand functional programming better, I realized that any functional algorithm could be converted to an imperative one, in order to save resources and increase performance.

The way I work goes like this: I write down a functional algorithm, then try to optimize it by converting it to an imperative one using assignments. I try to reuse memory space so that the algorithm becomes faster, without losing the meaning (and, hopefully, the correctness) of the functional algorithm.

Let me give you an example. Let's just say we have the following simple function:

let f(x) = x + 1

which is used here:

let g(y) 
with
    let a = f(y)
=
    a + 2    

We can see that the above is equivalent to the following (assignment is the operator ':='; the operator ',' is used for sequencing operations):

let g(y) = 
    y := y + 1,    (* reuse variable 'y' *)
    y + 2

In the above example, the variable 'y' is re-used (I am talking about variables here, not value bindings, because I am talking about implementation): instead of using an intermediate variable 'a', the compiler reuses the variable 'y', increases it by 1, then adds '2' to it and returns the result.
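
For illustration only, here is a minimal C++ sketch of the same reuse (my own example, not the output of any particular compiler): the intermediate 'a' disappears and the parameter's storage is written over instead.

int g(int y)
{
    y = y + 1;      // f(y), written back into 'y' (reuse variable 'y')
    return y + 2;
}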

Of course the above example is trivial and the relevant optimization may already take place in modern FP compilers. But what about more complex examples? For example, I was thinking about the 'quicksort' algorithm. The functional algorithm goes like this (in an imaginative mixture of functional and imperative syntax, using 'h:t' for head-tail list representation):

(* input is a list with first element as 'head' and the rest of elements as 'tail' *)
let qsort(head:tail)
with
    (* less and greater than nil is nil *)
    let less(e, nil) = nil
    let greater(e, nil) = nil
    
    (* calculate elements of input list 'h:t' less than 'e' *)
    let less(e, h:t) = 
        (if h < e then h else nil) + 
        less(e, t)
        
    (* calculate elements of input list 'h:t' greater than 'e' *)
    let greater(e, h:t) = 
        (if h > e then h else nil) + 
        greater(e, t)
=
    (* the result is: less elements + head + greater elements *)
    qsort(less(head, tail)) + 
    head + 
    qsort(greater(head, tail))

The actual implementation of the algorithm is quite slow, since it copies data around in a new list for each 'invocation' (I am not talking 'reduction' here, because I speak about actual code running on a computer). The ideal would be if the above functional algorithm could be converted automatically to an imperative one that does in-place sorting.

If we examine closely the above quicksort algorithm, we will see that the algorithm does the following things:

  1. get all the elements of the list less than its 'first element' into a 'list A'.
  2. get all the elements of the list greater than its 'first element' into a 'list B'.
  3. sort 'list A'.
  4. sort 'list B'.
  5. the end result is 'list A' + 'first element' + 'list B'.
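
For concreteness, here is roughly how that copying behaviour might look in C++ (a sketch of my own, using std::vector in place of the list type): fresh 'less' and 'greater' lists are allocated on every call, which is exactly the overhead discussed above. Elements that are not less than the head go into the 'greater' list here, so duplicates are kept.

#include <vector>

std::vector<int> qsort_copying(const std::vector<int>& in)
{
    if (in.empty()) return {};
    const int head = in.front();
    std::vector<int> less_part, greater_part;         // new lists on every call
    for (std::size_t i = 1; i < in.size(); ++i) {
        if (in[i] < head) less_part.push_back(in[i]);
        else              greater_part.push_back(in[i]);
    }
    std::vector<int> out = qsort_copying(less_part);  // sorted 'less' elements
    out.push_back(head);                              // + head
    for (int x : qsort_copying(greater_part))         // + sorted 'greater' elements
        out.push_back(x);
    return out;
}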

Let's see if we can speed up the above algorithm (let's name it quicksort A) by using two lists: a list A which is the input list and a list B which is the output list. Instead of creating new lists, we are going to use list B as an intermediate list for sorting elements, without changing the basic logic behind the sorting algorithm.

The 'less' function can be written like this:

let less(e, ha:ta, out hb:tb) =     
    if ha < e then        
        hb := ha,            (* copy value of element 'ha' to 'hb' *)                        
        less(e, ta, tb)      (* process rest of elements of A into tail of B *)
    else     
        less(e, ta, hb:tb)   (* process rest of elements of A into B *)

The above algorithm uses another list B to accumulate the 'less than first element' data of a list. The 'greater' algorithm can be written in the same way.

Now the qsort algorithm becomes:

(* input is list 'ha:ta', output is list B, which must have the same length as 'ha:ta' *)
let qsort(ha:ta, out B)
with
    let temp = List(sizeof(ha:ta))     (* temp is a list with same size as list 'ha:ta' *)
=
    less(ha, ta, temp),                (* accumulate elements of 'ta' less than 'ha' into 'temp' *)     
    qsort(temp, B),                    (* sort temp list into B *)
    B[sizeof(temp)] := ha,             (* copy 'ha' right after 'temp' sorted into B *)
    greater(ha, ta, temp),             (* accumulate elements of 'ta' greater than 'ha' into 'temp' *)
    qsort(temp, B + sizeof(temp) + 1)  (* sort temp list into B right after the less part and the 'ha' element *) 

The above algorithm (let's name it quicksort B) may be preferable to the pure one, because the intermediate variable 'temp' can live on the stack, increasing the performance of the program and reducing garbage (or enabling compile-time garbage collection). Here is a working C++ implementation:

#include <cstdio>
#include <iostream>

#define ARRAY_SIZE      5

int less(int e, const int in[], int in_size, int out[])
{
    int c = 0;
    for(int i = 0; i < in_size; ++i) {
        if (in[i] < e) {
            out[c] = in[i];
            ++c;
        }
    }
    return c;
}

int greater(int e, const int in[], int in_size, int out[])
{
    int c = 0;
    for(int i = 0; i < in_size; ++i) {
        if (in[i] > e) {
            out[c] = in[i];
            ++c;
        }
    }
    return c;
}

void sort(const int in[], int in_size, int out[])
{
    if (in_size == 0) return;
    int temp[ARRAY_SIZE];
    int c = less(in[0], in + 1, in_size - 1, temp);
    sort(temp, c, out);
    out[c] = in[0];
    int c1 = greater(in[0], in + 1, in_size - 1, temp);
    sort(temp, c1, out + c + 1);
}

void sort(const int in[ARRAY_SIZE], int out[ARRAY_SIZE])
{
    sort(in, ARRAY_SIZE, out);
}

int main(int argc, char* argv[])
{
    int in[ARRAY_SIZE] = {5, 6, 7, 3, 8};
    int out[ARRAY_SIZE];

    sort(in, out);

    for(int i = 0; i < ARRAY_SIZE; i++) {
        std::cout << out[i] << std::endl;
    }

    getchar();
    return 0;
}

Is it possible to optimize the algorithm further? Could the sorting be done without using a second list? The answer is yes. Let's examine the 'logic' behind the sorting again: we traverse a list twice, once to accumulate the elements 'less' than an element and once to accumulate the elements 'greater' than that element. In order to do the 'less' algorithm in place, we have to 'push' elements less than an element towards the front of the list while pushing greater elements towards the end of the list. The algorithm could be like this:

(* pushes elements of 'h:t' less than 'e' at beginning of list; 
   function greater is not needed, because all the elements after the 'less' part
   automatically belong to 'greater' part
 *)
let less(e, out h:t, out f, out l) =
    if h < e then            (* less element is found *)
        l := f,              (* save value of 'f' into position 'l' *)
        f := h,              (* place less element 'h' at beginning of list *)
        f = next(f),         (* set 'f' to next position in list *)
        l = h,               (* note down the 'free' position *)
        less(e, t, f, l)     (* process the rest of elements of 't' *)
    else
        less(e, t, f, l)     (* less element is not found *)
        
(* quick sort *)        
let qsort(h:t)
with
    let f = h                (* 'f' is the first free entry in the list *)
    let l = h                (* 'l' is the last freed entry in the list *)
= 
    less(h, t, f, l),        (* push less than 'h' elements to beginning of 't' *)
    l := f,                  (* before setting middle element, save the value at middle element position *)
    f := h,                  (* set middle element *)
    qsort(t[begin..f]),      (* sort less part *)
    qsort(t[f+1..end])       (* sort greater part *)
        

Here is the working C++ implementation (let's call this version 'quicksort C'):

#include <cstdio>
#include <iostream>

#define ARRAY_SIZE      5

int less(int e, int a[], int begin, int end, int &f)
{
    int l = f;
    for(int i = begin; i < end; ++i) {
        if (a[i] < e) {
            a[l] = a[f];
            a[f] = a[i];
            l = i;
            ++f;
        }
    }
    return l;
}

void sort(int a[], int begin, int end)
{
    if (end <= begin) return;
    int e = a[begin];
    int c = begin;
    int l = less(e, a, begin + 1, end, c);
    a[l] = a[c];
    a[c] = e;
    sort(a, begin, c);
    sort(a, c + 1, end);
}

void sort(int a[ARRAY_SIZE])
{
    sort(a, 0, ARRAY_SIZE);
}

int main(int argc, char* argv[])
{
    int a[ARRAY_SIZE] = {5, 6, 7, 3, 8};
    
    sort(a);
    
    for(int i = 0; i < ARRAY_SIZE; i++) {
        std::cout << a[i] << std::endl;
    }
    
    getchar();
    return 0;
}

After doing all the above, I realized that I had converted the purely functional quicksort A into the imperative algorithm quicksort B, and then taken that algorithm and converted it into quicksort C, by doing some replacements and adjusting the algorithm to overwrite data. The quicksort C algorithm is quite similar to the well-known partitioning algorithm (if not the same?). I have come to suspect that:

  • such a conversion from 'purely functional' to 'imperative' algorithms could be formalized for any algorithm; then the compiler could handle assignments by automatically recognizing what can be reused and what can't
  • all assignments in an imperative algorithm are optimizations of the relevant 'pure' algorithm(s)

Since there is no formal way of converting a functional algorithm to an imperative one, programming languages put the burden of assignment on the programmer, thus allowing assignment-related bugs to creep into an application; the actual problem with assignments is that a programmer can easily forget when an assignment affects another part of a program. If a compiler could convert a functional algorithm to an imperative one so that the least possible amount of resources is spent (least copying / least garbage), then functional programming languages could easily reach the performance of imperative languages, while keeping their 'pure' properties.

I searched the web but did not find a relevant paper. Does anybody know anything about this? I apologise if this is trivial, if I have made some silly mistake, or if this has already been shown to be infeasible, but my interest in this subject is great.


Sort of

all assignments in an imperative algorithm are optimizations of the relevant 'pure' algorithm

Sure, inasmuch as any step in an imperative program is just a function from State -> State. The problem is that the underlying 'pure' algorithm can easily be intractable. In particular, destructive mutation operations on data structures with cycles and sharing cannot be directly mapped to lazy functional operations without risk of exponential blowup.
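
A rough sketch of that State -> State reading (my own illustration, with made-up names): model the store as a map from variable names to values, and read an assignment such as 'x := x + 1' as a pure function from one store to the next.

#include <map>
#include <string>

using Store = std::map<std::string, int>;

Store step_increment_x(Store s)        // State -> State
{
    s["x"] = s["x"] + 1;               // the assignment, expressed as
    return s;                          // producing a whole new store
}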

Other Way

He's not asking if he can go from an imperative algorithm to a functional one, though; he's asking about the other direction. To me, it sounds isomorphic to the question "how do good functional language compilers work," given that they're targeting stateful environments like microprocessors and memory. It's a good question, but I confess to not being at all familiar with that portion of the functional language literature.

At some level, FPs have state...

...it's just that this state is wrapped up in the stack of function calls, rather than being external to the function parameters. This is what CTM calls implicit or declarative state (as opposed to explicit state). The difficulty I see in unwinding this implicit state and trying to make it explicit is that the explicit state must be manually patched to follow the path of function calls.

It's not exactly functional -> imperative

In Haskell you can have pure, immutable arrays and define a functional quicksort on them. What you are asking is why a compiler cannot perform a storage analysis (with a parametric proof for arrays of size n) of an execution of this function to termination, recognize that a mutable array of size n in fact provides the optimal layout, and generate the corresponding code.

I don't think it's impossible. If it is very hard, it could be because the semantics of the language do not convey the hints the compiler needs in order to replace an immutable array with a mutable one. In any case, producing the equivalence proofs (even by hand, for complex programs) could be very difficult.

Linear Types?

I haven't read a lot on the subject, but Baker seemed to like the idea of linear objects, which seem to address the problem in some cases. Of course, this puts the burden of recognising sharing on the programmer instead of the compiler, but it is better than nothing.

Linear types prohibit sharing....

The idea behind linear types is that the type system enforces the invariant that there is only ever a single reference to an object. This means that a value with a linear type can't be aliased at all.

Furthermore, if you know that aliasing can't happen, then you can update a value imperatively, and to the program it will still -look- like it's a functional program.

(* signatures *)

val update : linear char array * int * char -> linear char array

(* code *)

(* 1. Make a linear array *)

let a = linear ['a'; 'b'; 'c'; 'd']

(* 2. Update it. 
      At this point a' = ['a'; 'b'; 'x'; 'd']. *)

let a' = update(a, 2, 'x')

(* We get a type error if we try to reuse a *)

let INCORRECT = update(a, 0, 'z')  (* ERROR *)

(* But we can still use a' once *)

let a'' = update(a', 0, 'z') (* OKAY *)
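
For comparison, here is a loose C++ analogue of that discipline (my own sketch, not part of the example above): std::unique_ptr gives the array a single owner, so update can mutate in place while callers treat it as consuming the old value and producing a new one. Unlike real linear types, C++ does not reject reuse of the moved-from value at compile time.

#include <memory>
#include <utility>
#include <vector>

using LinearCharArray = std::unique_ptr<std::vector<char>>;

LinearCharArray update(LinearCharArray a, std::size_t i, char c)
{
    (*a)[i] = c;            // in-place update is safe: this is the only reference
    return a;
}

int main()
{
    auto a  = std::make_unique<std::vector<char>>(
                  std::vector<char>{'a', 'b', 'c', 'd'});
    auto a1 = update(std::move(a), 2, 'x');    // 'a' is moved-from afterwards
    auto a2 = update(std::move(a1), 0, 'z');   // 'a1' is used exactly once more
    return 0;
}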

Copy-on-Write

Looks a bit like the copy-on-write done by Tcl's interpreter (and others?). All values are reference-counted, so you know that if the refcount == 1 then you are the only reference and can destructively update, otherwise you need to make a copy. Works well.
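
A minimal copy-on-write sketch in C++ (my own illustration, not Tcl's actual machinery): every write checks the reference count, copies the buffer only if it is shared, and otherwise updates destructively.

#include <memory>
#include <vector>

class CowIntList {
public:
    explicit CowIntList(std::vector<int> v)
        : data_(std::make_shared<std::vector<int>>(std::move(v))) {}

    int get(std::size_t i) const { return (*data_)[i]; }

    void set(std::size_t i, int value)
    {
        if (data_.use_count() > 1)   // shared: someone else still sees this buffer,
            data_ = std::make_shared<std::vector<int>>(*data_);   // so copy first
        (*data_)[i] = value;         // refcount is 1 here: safe to update in place
    }

private:
    std::shared_ptr<std::vector<int>> data_;   // copying a CowIntList shares the buffer
};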

Yeah, that's exactly right

Yeah, that's exactly right. Linear types are just a type system for ensuring that the reference count of an object is always 1.

This works when you want to...

This works when you want to have the expressiveness of a basic pure functional solution while having a good portion of the efficiency of more imperative solutions (which is admittedly very useful). Unfortunately (obviously), we still need something else for side-effects in the presence of sharing. I've been trying to find good solutions for a partial specialiser (not too complicated, just some numeric work at first), without great success.

If you need to reason about sharing...

If you need to reason about sharing using a type system, you might want to look at Walker's alias types paper.

If you want to reason about aliasing using a specification language, I'd suggest looking at separation logic.

SSA

I suspect that this problem is basically what SSA-based compilers do, when they convert the SSA intermediate representation back into imperative machine code. SSA is an intermediate representation where each variable is assigned to only once - sounds a lot like functional programming, and I think it's been proved isomorphic to continuation-passing style. You may want to check out SSA-based compilers like LLVM or recent versions of GCC, and advanced compiler design textbooks like Stephen Muchnick's Advanced Compiler Design and Implementation.
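
A tiny illustration of the analogy (my own example): the reassignments to 'x' in the first version become distinct single-assignment names in the SSA-style version, which reads like a chain of let-bindings in a functional program.

int imperative_form(int a)
{
    int x = a;
    x = x + 2;        // 'x' is updated in place
    x = x * 3;
    return x;
}

int ssa_style_form(int a)
{
    const int x0 = a;         // x0 = a
    const int x1 = x0 + 2;    // x1 = x0 + 2
    const int x2 = x1 * 3;    // x2 = x1 * 3  -- each name is assigned exactly once
    return x2;
}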

As Appel said ...

"...sounds a lot like functional programming..."

Andrew W. Appel had something to say about that.

Translating out of SSA

I think this is the wrong track for the original question. However, if you're interested in the problems compilers face when translating out of SSA you might want to read Comparison and Evaluation of Back Translation Algorithms for Static Single Assignment Form.

SSA

From what I can understand, one of the properties of SSA is that it transforms an imperative program into a functional one, because each variable is assigned only once, and therefore never updated (so referential transparency is achieved). But what about the reverse? The papers posted above give some hints on 'joining' two 'instances' of a variable, but I cannot say I was able to follow them completely.

I guess isomorphism and transformation of algorithms is still an area of mathematics/CS that has not been fully researched yet... but I have a feeling that the solution for 'bug-free' programming lies there (to the point that functional programming languages become the de facto way of developing software).

Converting the program AST to SSA...

Converting the program AST to SSA is an imperative -> functional transition. The reverse, functional -> imperative transition comes when you convert SSA into imperative machine code, i.e. at code-generation time. So I'd think that things like register allocation, liveness analysis, flow analysis, etc. would prove handy. And there are pattern-matching techniques to recover things like while() loops and if() statements from the control flow graph (I looked for a link but couldn't find one; maybe someone else can provide it).
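
As a small illustration of that direction (my own example, not from any particular compiler): a tail-recursive, single-assignment formulation, and the while() loop a code generator would like to recover from it.

int sum_to(int n, int acc)
{
    if (n == 0) return acc;
    return sum_to(n - 1, acc + n);   // tail call: nothing happens after it
}

int sum_to_loop(int n)
{
    int acc = 0;
    while (n != 0) {     // the tail call becomes in-place parameter updates
        acc = acc + n;   // acc' = acc + n
        n = n - 1;       // n'   = n - 1
    }
    return acc;
}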

Alternative approach - replace CBV by CBN and vice versa

From Control categories and duality: on the categorical semantics of the lambda-mu calculus:

For instance, a purely functional call-by-value term is mapped to a call-by-name term that relies almost exclusively on control operators, and vice versa. This observation suggests that, from a practical point of view, certain algorithms are more naturally formulated in a call-by-value paradigm, and others in call-by-name.

OTOH, if you are interested in compiling the program as opposed to expressing it in a more natural form, you might want to read up on lambda lifting, closure conversion, and defunctionalization (and linear types for safe in-place updates).
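
For what it's worth, here is a tiny sketch of defunctionalization (my own example, with made-up names): the higher-order value 'add n' is replaced by a first-order data representation plus an 'apply' function that interprets it.

#include <vector>

struct AddConst { int n; };                    // defunctionalized 'fun x -> x + n'

int apply(const AddConst& f, int x)            // the interpreter for that representation
{
    return x + f.n;
}

std::vector<int> map_add_const(const AddConst& f, const std::vector<int>& xs)
{
    std::vector<int> out;
    out.reserve(xs.size());
    for (int x : xs) out.push_back(apply(f, x));   // no closures needed anywhere
    return out;
}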

same concept

I came up with the same concept after looking at the Haskell/C verification of seL4. The idea was to use functional programming to get the algorithm right, with all the good tools available for that, and then do a stepwise refinement of each low-level function to an imperative one. Something much more lightweight than a full verification. I could then use all the automated tooling for languages like C, plus their optimizing compilers, for the default implementation. The original form gets compiled in a correct way to serve as the reference; if the fast version doesn't work, users can try the more robust one. It might expose compiler errors too.

Anyway, I had trouble finding any information on the topic. Did you find any good papers or tools during your research? I'd be interested in them.

any functional algorithm

any functional algorithm could be converted to an imperative one,

Pretty much any compiler for a functional language is doing that, to produce object code that runs on typical hardware.

... in order to save resources and increase performance.

Do you have any evidence the functional algorithm isn't performing well? That is, after compiling through an optimising compiler such as GHC. Compilation techniques such as list fusion, deforestation, cardinality analysis (and others mentioned above) are claimed to arrive at object code as efficient as 'hand crafted'.
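
To give a feel for what fusion/deforestation aims at (my own illustration, not GHC's actual output): the naive version materialises an intermediate list of squares, the fused version never builds it.

#include <vector>

int sum_of_squares_naive(const std::vector<int>& xs)
{
    std::vector<int> squares;                      // intermediate structure
    for (int x : xs) squares.push_back(x * x);
    int total = 0;
    for (int s : squares) total += s;
    return total;
}

int sum_of_squares_fused(const std::vector<int>& xs)
{
    int total = 0;
    for (int x : xs) total += x * x;               // the intermediate list is fused away
    return total;
}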

Perhaps you should put effort into writing idiomatic functional code, rather than trying to adapt imperative thinking into functional and ending up doing neither well.

Since there is no formal way of converting a functional algorithm to an imperative one ...

?? Turing 1936 showed that the Lambda Calculus and the Turing Machine are expressively equivalent. He did that by converting between the two.

Evidence?

That depends on what you mean by performing well. Certainly programs written in Haskell and compiled with GHC do not equal programs written in C/C++ for performance.

Turing Tarpit

Turing 1936 showed that the Lambda Calculus and Turing Machine are expressively equivalent. He did that by converting between the two.

That's the classical Turing tar-pit argument, about which Perlis's epigram says:

54. Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy.

Converting a functional algorithm to a (non-trivially) more efficient imperative one is hard. And for now, imperative algorithms are usually faster, since they are more amenable to efficient machine translation.

Since in the general case such a conversion will need an accompanying proof of correctness, the claims of the original author stand, unless sufficient breakthroughs are made in AI.