Function Readability & Understandability

Two versions of the same function in JavaScript, because it's a language that supports both the functional and the imperative style. The function accepts an array of arrays and outputs the permutations of the values, so [['a', 'b'], ['1', '2']] becomes [['a', '1'], ['a', '2'], ['b', '1'], ['b', '2']]:

function permutations1(x) {
    return x.reduce(function(a0, v0) {
        return a0.reduce(function(a1, v1) {
            return a1.concat(v0.map(function(v2) {
                return v1.concat([v2]);
            }));
        }, []);
    }, [[]]);
}

function permutations2(x) {
    var a = [[]];
    for (var i = 0; i < x.length; ++i) {
        var b = [];
        for (var j = 0; j < a.length; ++j) {
            for (var k = 0; k < x[i].length; ++k) {
                b.push(a[j].concat([x[i][k]]));
            }
        }
        a = b;
    }
    return a;
}
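
For concreteness, with the example input from above, both versions should produce the same output:

permutations1([['a', 'b'], ['1', '2']]); // => [['a', '1'], ['a', '2'], ['b', '1'], ['b', '2']]
permutations2([['a', 'b'], ['1', '2']]); // => the same result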

Which do you find easier to read and understand, and if you find one easier, what do you think the reasons are?

Recursion

This is one of those instances where the recursive version seems easier to understand: build the permutations (is there a better term for this?) of xs:xss by building the permutations of xss and prepending the possible xs to each.

perm :: [[a]] -> [[a]]
perm []       = [[]]
perm (xs:xss) = [x:ys | x <- xs, ys <- perm xss]

This is untested but looks much clearer to me. My JavaScript is weak, so I won't attempt to translate this to decide whether the TL;DR is to use recursion and comprehensions or to use Haskell instead of JavaScript.

And to answer your question: I like the second one better.

Liking the second one better?

What about the second version makes it clearer? I have some thoughts, but I want to see what other people think without influencing them.

Gobbledygook

Array.prototype.slice.call(arguments).reduce(...

This is gobbledygook to me. If I knew what each of those words meant, I'm sure it would be clearer, but I'm generally not a fan of this kind of functional style. You have to mentally model each sequence of words. I find it easier to model the comprehension I wrote above in one step rather than to piece together a bunch of higher-order incantations.

JavaScript Weirdness

Well, it's a real function from an application I was asked to help with, and I came up with the two possible implementations. If you allow the argument to be an array of arrays, rather than passing the arrays 'varargs' style, you can get rid of the gobbledygook. I will edit the original function definitions because I don't want this bit to get in the way of discussing the different styles.
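
For reference, a minimal sketch of the varargs form (the wrapper name here is made up): Array.prototype.slice.call(arguments) just turns the array-like arguments object into a real Array so that reduce and friends are available on it.

// Hypothetical varargs wrapper: convert the array-like `arguments` object
// into a real array, then reuse the array-of-arrays version.
function permutationsVarargs() {
    var xss = Array.prototype.slice.call(arguments);
    return permutations2(xss);
}

// permutationsVarargs(['a', 'b'], ['1', '2']) behaves like
// permutations2([['a', 'b'], ['1', '2']]).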

What I want to understand is how people find the readability and understandability of the imperative versus the higher-order functional style.

In JavaScript syntax I like the second one better.

JavaScript lends itself to the procedural style; the functional style is a poor fit for the language. In a Lisp or some other naturally compositional language, it would be clearer in the functional style.

Consider the inductive definition of a permutation: for each element of the list, prepend it to each permutation of the rest of the elements of the list. Then translate the induction to recursion, and you get:

;; remove one element from a list.  Used to remove the prepended
;; element when deriving permutations.
(define (remove x lst)
  (cond
   ((null? lst) '())
   ((equal? x (car lst)) (remove x (cdr lst)))
   (else (cons (car lst) (remove x (cdr lst))))))

;; return a list of the permutations of lst.
(define (permute lst)
  (cond
   ((= (length lst) 1) (list lst))
   (else (apply append
                (map (lambda (i)
                       (map (lambda (j) (cons i j))
                            (permute (remove i lst))))
                     lst)))))

Understanding Intermediate Types

As Lisp is untyped, let's use the term 'structure'. This has the same problem (for me) as the JavaScript functional version: it is not clear what the intermediate types/structures are. For example, to know the shape of the data structure returned from '(apply append (map (lambda (i) ...' I have to know the rest of the function; I have to work from the leaves of the AST upward, holding the structure in my head. This makes it hard to understand what the code does, and it makes it even harder to just look at a fragment of code and know what it does.

The imperative version has the advantage that we establish the type of the mutable object to be an array of arrays '[[]]', and it remains that type for the whole program fragment. Understanding mutations on an object that keeps the same type throughout seems easier for people than constructing the intermediate types as they parse the AST from the bottom up in their heads.
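
To make the point concrete, here is the functional version again with the intermediate structures noted as comments (my reading of it, not part of the original definition):

// x is an array of arrays, e.g. [['a', 'b'], ['1', '2']]
function permutations1(x) {
    return x.reduce(function(a0, v0) {
        // a0: the partial results so far, e.g. [['a'], ['b']]
        // v0: the next input array, e.g. ['1', '2']
        return a0.reduce(function(a1, v1) {
            // a1: results accumulated in this pass
            // v1: one partial result, e.g. ['a']
            return a1.concat(v0.map(function(v2) {
                // v2: one value from v0, e.g. '1'
                return v1.concat([v2]); // extend the partial result: ['a', '1']
            }));
        }, []);
    }, [[]]); // start from the single empty partial result
}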

For me, I like the elegance of the functional approach when I write it, but I don't like it when I have to read and understand other people's code (or my own old code). Ironically, for functional languages that have read-only data, the code itself seems to be write-only :-)

Cross product

I like the list comprehension syntax. I wonder if it would be even clearer with a relational-cross-product-like operator (*) on arrays:

perm [] = [[]]
perm (xs:xss) = xs * perm xss

edit: This of course is:

foldl (*) [[]] xss

Is that what the relational

Is that what the relational cross product operator does? I would guess it to be symmetric, using a pair constructor instead of cons.

[a, b]*[c, d] = [(a,c), (a,d), (b,c), (b,d)] ≠ [a:c, a:d, b:c, b:d]
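
In JavaScript terms that pairing version would be something like this (a sketch; the function name is made up):

// Pair every element of xs with every element of ys, using two-element
// arrays in place of tuples.
function crossPairs(xs, ys) {
    var a = [];
    for (var i = 0; i < xs.length; ++i) {
        for (var j = 0; j < ys.length; ++j) {
            a.push([xs[i], ys[j]]);
        }
    }
    return a;
}

// crossPairs(['a', 'b'], ['c', 'd']) => [['a','c'], ['a','d'], ['b','c'], ['b','d']]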

Relational Cross Product

Well, it joins two tables in an extensible way, so:

[(a, b), (c, d)] * [(1, 2), (3, 4)] = [(a, b, 1, 2), (a, b, 3, 4), (c, d, 1, 2), (c, d, 3, 4)]

In a language like JS, which has no native tuples (or where tuples and arrays are the same thing, which might be a good idea), you would use arrays instead; it obviously works just as well with tuples, the output being an array of tuples instead of an array of arrays:

[[a, b], [c, d]] * [[1, 2], [3, 4]] = [[a, b, 1, 2], [a, b, 3, 4], [c, d, 1, 2], [c, d, 3, 4]]

So it would have to assume the convention that a singleton tuple is just a value.

I am not sure if the distinction between list and tuple is meaningful in a dynamically typed language?

The closest I can get in JS is:

function cross(x, y) {
    var a = [];
    for (var i = 0; i < x.length; ++i) {
        for (var j = 0; j < y.length; ++j) {
            a.push(x[i].concat([y[j]]));
        }
    }
    return a;
}

function permutations3(xss) {
    return xss.reduce(cross, [[]]);
}

Tensor Notation

Tensor notation could also be fairly intuitive and compact. If '++' is concatenation and <x> is a tensor subscript, then the relational cross product would be something like:

function cross(a, b) { 
   var c<0 * 1> = a<0> ++ b<1>
   return c;
}

function perm(xss) {
    foldl cross [[]] xss;
} 

edit: I have been editing the above to try and represent the operation by extending tensor notation to arbitrary operations. I think it's now reasonable, as you can imagine the "forall i", "forall j" nested loops, and it would generate the Cartesian product. Numbered indexes might make more sense than letters, which could get confused with variables.

Readability and Understandability

Clearly there are some better notations than are available in JavaScript, but I am still interested in the original question. Which of the two original function definitions seems easier to understand, and why?

Folds aren't very readable for me

I find non-associative folds pretty hard to read in general. The readability gets all tangled up in argument positions, and argument positions already get tangled up with things like Haskell's currying and JavaScript's object access syntax. For a left-to-right fold over a list, I prefer for both argument lists to be in proximity, and I strongly prefer for the accumulated value to be on the left, rolling things up as it goes along:

return _.arrFoldl(state, xs, function(state, x) {
    return state;
});
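
(Aside: _.arrFoldl above is presumably just a left fold with the argument order described, something like this sketch in plain JavaScript:)

// Left fold: accumulated state first, then the array, then the step function.
function arrFoldl(state, xs, f) {
    for (var i = 0; i < xs.length; ++i) {
        state = f(state, xs[i]);
    }
    return state;
}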

Still, this doesn't compare to my familiarity with variable assignment, so permutations2() wins for me.

If I were writing this code myself, I would probably use a combined style like this:

function permutations(choicesForPositions) {
    var permsForPrefix = [[]];
    _.arrEach(choicesForPositions, function(choicesForPosition) {
        permsForPrefix = _.arrMappend(permsForPrefix, function(permForPrefix) {
            return _.arrMappend(choicesForPosition, function(choice) {
                return [permForPrefix.concat(choice)];
            });
        });
    });
    return permsForPrefix;
}

It was only after I chose verbose variable names that I realized the strongly typed case is actually when [[['a'], ['b']], [['1'], ['2']]] maps to [['a', '1'], ['a', '2'], ['b', '1'], ['b', '2']]. Your example input [['a', 'b'], ['1', '2']] relies on some weak typing; each string 'a' is effectively coerced to an Array ['a'] as it enters Array#concat(). If I were programming based on this example alone, I would be surprised when I passed in an actual Array and its contents were magically spliced into the output sequences.
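
For reference, the concat behaviour in question (array arguments are spliced, non-array arguments are appended as single elements):

['x'].concat('a');          // => ['x', 'a']        (string appended as one element)
['x'].concat(['a', 'b']);   // => ['x', 'a', 'b']   (array contents spliced in)
['x'].concat([['a', 'b']]); // => ['x', ['a', 'b']] (extra [] keeps it as one element)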

Weak Typing

Hmm, good spot. It's only using concat because it creates a new Array for the result. I guess what is really needed is a push that returns a new copy. It was not my intention to rely on weak typing, so I would class this as a bug. I will fix the examples.
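
Something like this, say (the name is made up):

// A non-mutating push: copy the array and append the value as a single
// element, whether or not that value is itself an array.
function pushCopy(xs, v) {
    var a = xs.slice();
    a.push(v);
    return a;
}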

Errors should be visible

Programs should be written so that you can tell just by looking at them whether or not they're right. You should be able to tell what they're supposed to do (including both the overall algorithm and the details), and if there's an error, you should be able to tell that too. I find both versions pretty hard to read; as for being able to tell if there's something wrong, although I can't tell offhand whether the functional version has the right algorithm, I do have confidence that it can't suffer from off-by-one errors. That seems like a level-of-abstraction issue, rather than a functional-versus-imperative issue as such. In seriously tackling this from a language-design perspective, I'd be inclined to start by trying to write a really clear pseudo-code-like description of the algorithm, and then ask what sort of language would both (1) support that syntax and (2) cause any errors in it to be readily visible.

I don't understand your

I don't understand your tensor notation above or how the relational cross product is supposed to solve this problem. Could you elaborate? I can rewrite my solution as a fold like this:

perm = foldl (\yss xs -> [x:ys | x <- xs, ys <- yss]) [[]]

I don't see how that lambda corresponds to cross.

foldl cross working example


function cross(x, y) {
    var a = [];
    for (var i = 0; i < x.length; ++i) {
        for (var j = 0; j < y.length; ++j) {
            a.push(x[i].concat(y[j]));
        }
    }
    return a;
}

function permutations(xss) {
    var xsss = xss.map(function(xs) {
        return xs.map(function(x) {
            return [x];
        });
    });
    return xsss.reduce(cross, [[]]);
}

As for the tensor notation, now that I think about it, I am probably abusing the mathematical notation too much, although there is probably something that could be done with some more thought.

Concat is not cons

I'm guessing this works because JavaScript is weakly promoting single values to arrays when you don't pass it an array. This will turn into a bug when you want to find the permutations of arrays. No?

Concat used like cons

Because it wraps each value in another array, it is effectively doing:

a.concat([b])

which is the same as cons (apart from producing a fresh copy). 'permutations' does the wrapping before calling cross. Without the '[]' wrapping it would produce incorrect results when the values are themselves arrays.

The alternative is to define this operation:

function op(x, y) {
    var a = [];
    for (var i = 0; i < x.length; ++i) {
        for (var j = 0; j < y.length; ++j) {
            a.push(x[i].concat([y[j]]));
        }
    }
    return a;
}

function perm(xss) {
    return xss.reduce(op, [[]]);
}

This isn't a cross product; it does exactly the same thing as the list comprehension now. I am not quite sure what it would be called.

Oh I missed that you're

Oh, I missed that you're preprocessing the input to add the required [] wrapper before you call reduce. That takes away a bit from the elegance, I think. We could define:

pairwise :: (a -> b -> c) -> [a] -> [b] -> [c]
pairwise f xs ys = [f x y | x <- xs, y <- ys]

perm = foldl (pairwise (:)) [[]]

This looks like it might work until you puzzle out that foldl takes its arguments the other way. Plumbing these functional contraptions together is a pain as soon as you're not pipelining single-parameter functions, which is most of why I dislike this style. I'd stick with something like the explicit recursion, or maybe something like this:

def perm xss 
   var tot = [[]]
   for xs in xss
      tot = [x:ys | x <- xs, ys <- tot]
   return tot

Language Design

This version works for me, and moving the discussion towards language design, I think I can see some general principles here:

- Keep the number of object types to a minimum, preferring to mutate an object of fixed type over creating intermediate objects of unknown type, or whose type depends on other types.
- Prefer non-scoped blocks that do not capture values (for-loop-style blocks over lambdas).
- Prefer declarative value operations like list comprehensions over loops or higher-order functions where it is simpler (however I have seen people who struggle with complex SQL queries but are happy to write imperative code to do the same thing).

It is also tempting to not allow any free variables in function definitions, so that only local variables and arguments are allowed.

however I have seen people

however I have seen people who struggle with complex SQL queries but are happy to write imperative code to do the same thing

SQL is not compositional, so I don't find this surprising at all. Composing LINQ queries is much easier for example, and complex LINQ queries aren't nearly as difficult as complex SQL queries (although LINQ could be better too).

SQL is compositional

SQL has two syntaxes for joins; the compositional one is:

(subquery...) join (subquery...) on ...

Right, so full SQL is not

Right, so full SQL is not compositional, and it provides no means of query abstraction. Each of those subqueries should in principle permit local bindings:

DECLARE @tmp = (subquery...)
(subquery...) join @tmp on ...

Stored procedures are a kludge compared to proper query abstraction.

Does it need them

Well you can use 'as'?

(select ... as tmp) join tmp

Still, even if it didn't support this, that would not make it non-compositional. You can have compositional languages with no variables at all.

As far as I know, there is no query that can be expressed using the non-compositional (relational-calculus-like) syntax that cannot be expressed in the compositional (relational-algebra-like) part.

I don't think this is the problem, though. I think it's that people lose track of the intermediate results. This is the same as with the code examples above: the imperative version reduces the intermediate results and minimises the number of different types (in this case the result row types).

foldl cross product

I agree about error visibility. I think the left fold of the (relational) cross product is the clearest and most elegant description:

function perm3(xss) {
    return xss.reduce(cross, [[]]);
}

Cartesian product

'Cartesian product' would be a much better name for this. 'Permutations' is extremely misleading.

I recently wrote this function myself in Python and it was pretty much identical to Matt M's last version, which I requote here:

def perm xss 
   var tot = [[]]
   for xs in xss
      tot = [x:ys | x <- xs, ys <- tot]
   return tot

To your original question: The second one is more readable to me, even after knowing that 'cartesian product' is the intent.

Edit: Why? Probably many little things. Argument order confusion, many variables with similar names, no indication of types, more function calls, no naming conventions to rely on (like i, j, k being indices). Subscripts instead of element names allow you to see where values come from; in the functional code you have to go through many hops to figure out where, say, v2 comes from. JavaScript syntax. The fact that reduce calls tie together four things at the same time (sequence, initial state, element, next state), whereas for loops only tie together three things (index, bound, body). The imperative code starts with the base case and then inducts, while the functional does the opposite. In the functional code there is no direct tie between the initial state value of the folds and the variable that holds the state. Each line starting with "return" makes the functional code harder to read (but perhaps not if you've developed a habit of looking past it). In the imperative code most of the important logic is concentrated on one line, whereas in the functional code it's more spread out.

Compare also how the mathematicians on Wikipedia express the concept. That is probably as clear as it can get.
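
For reference, the usual set-builder form is roughly:

A × B = { (a, b) : a ∈ A, b ∈ B }

and for the n-ary case:

X₁ × … × Xₙ = { (x₁, …, xₙ) : xᵢ ∈ Xᵢ for each i }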

Functional is supposed to be better?

The mathematical notation is nice, but it does depend on arbitrary convention. To formalise the notation so that it relies on only a small set of extensible syntax/semantics seems hard. A generic comprehension notation seems the best bet, and is similar.

The question is: given that we have finite intelligence, when it comes to understanding complex algorithms and programs, are we able to cope with more complexity in imperative languages? Should a language designed to let programmers manage the highest levels of complexity therefore be an imperative language?