Functional random numbers without threading state

I was thinking about this problem the other day and came up with what I think is a tenable solution. It's quite possible someone else has thought of this before but I haven't seen it.

The idea is very simple: provide the programmer with a function random(x), which deterministically maps real numbers to random values (e.g. a white noise function). Within any function definition (or other code unit), the programmer obtains random numbers by calling the function with different arguments, e.g.:

let my_random_data = [random(0), random(1), random(2)]
-> [0.53, 0.12, 0.79]

Because random(x) is deterministic, these expressions can be reused at will, returning their original results:

let my_random_data = [random(0), random(1), random(2), random(0), random(1), random(2)]
-> [0.53, 0.12, 0.79, 0.53, 0.12, 0.79]
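
To make this concrete, here is a minimal OCaml sketch of one possible random(x). The hashing scheme is purely illustrative; any deterministic map from floats to well-distributed values in [0, 1) would do:

(* Deterministic "white noise": hash the bit pattern of x down to [0, 1). *)
let random (x : float) : float =
  let h = Hashtbl.hash (Int64.bits_of_float x) in
  float_of_int (h land 0x3FFFFFFF) /. float_of_int 0x40000000

(* Re-evaluating the same expressions yields the same values. *)
let my_random_data = [random 0.; random 1.; random 2.]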

Now, to enable modular use of this function, we can simply derive from it a new function, random'(y), whose domain is again the real numbers but which maps into an interval of the domain of random(x), say 0 < x < 1:

let random'(y) = random(atan(y) / pi + 0.5)

Now another function or module can use random'(y) in the same manner as the main code used random(x), without fear of duplicating the random sequence. A new derived function can be created for each module which requires its own random function, simply by choosing a different interval of random(x) to map into:

let random''(z) = random(atan(z) / pi + 1.5)
let random'''(a) = random(atan(a) / pi + 2.5)

This process can be repeated within sub-modules, etc.

Of course, you can simplify this pattern using a higher-order function, which I will call "delve":

let delve(random, index) = (x) -> random(atan(x) / pi + index + 0.5)

So an example program might look like:

let create_random_data(random') = [random'(0), random'(1), random'(2)]

let my_random_data =
create_random_data(delve(random, 0)) ::
create_random_data(delve(random, 1)) ::
create_random_data(delve(random, 2))

-> [0.53, 0.12, 0.79, 0.16, 0.08, 0.92, 0.65, 0.50, 0.01]
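
Rendered as actual OCaml, and assuming a random : float -> float noise function like the sketch above, the example might read as follows (the concrete output values will of course differ from those shown):

let delve random index =
  fun x -> random (atan x /. Float.pi +. index +. 0.5)

let create_random_data random' = [random' 0.; random' 1.; random' 2.]

let my_random_data =
  create_random_data (delve random 0.)
  @ create_random_data (delve random 1.)
  @ create_random_data (delve random 2.)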

I wrote an OCaml module implementing roughly this interface, albeit over integers, which is available here (interface). It reuses the built-in sequential random number generator, and is optimized for the case when the random function is accessed sequentially. (It must "replay" the random sequence if it is accessed otherwise.) The "delve" function is implemented by using the value from the given random sequence as a seed for a new random sequence.
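
For the curious, the shape of that implementation is roughly the following. This is only a sketch, not the linked module, and it omits the optimization for sequential access (this naive version replays from the seed on every call):

type t = { seed : int }

let make seed = { seed }

(* random t i: the i-th value of the sequence generated from t's seed. *)
let random t i =
  let st = Random.State.make [| t.seed |] in
  for _skip = 1 to i do ignore (Random.State.bits st) done;
  Random.State.bits st

(* delve t i: a fresh sequence seeded with the i-th value of t's sequence. *)
let delve t i = make (random t i)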

Thoughts? Improvements?

It should work great as long

It should work great as long as for each expression you know statically the indexes you should use. But try your functions in a program that needs to choose random numbers inside of a recursive call, and I bet you'll find yourself passing indexes to keep from getting repeats: either indexes for "delve" or indexes for "random".
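
For instance, a recursive builder ends up looking something like this (using the post's random/delve purely as illustration); either a freshly delved generator or an explicit index gets threaded through the recursion:

(* Thread a freshly delved generator down each recursive call... *)
let rec random_list random n =
  if n = 0 then []
  else random 0. :: random_list (delve random 1.) (n - 1)

(* ...or thread the index itself. *)
let rec random_list' random i n =
  if n = 0 then []
  else random (float_of_int i) :: random_list' random (i + 1) (n - 1)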

I've also had an aversion to the random monad since I first learned it, but it was only recently that I figured out why. It's because a monad is too strong a notion. You can get the same effect with an applicative functor, threading an infinite binary tree. Mathematically, it's no more difficult to put a probability measure on an infinitely deep tree than on an infinite list. For an implementation, I suppose you'd want some kind of branching pseudorandom number generator, or use two of them on the same input to branch the state. (Disclaimer: I have no idea whether mixing existing PRNGs is a good idea. It might result in bad randomness/cryptographic properties if done wrongly.)

Anyway, the idea is that using randomness doesn't need to sequence your applicative expressions like using a monad does.
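
A minimal sketch of the tree I have in mind, with the branching done by (ab)using the stdlib generator to derive child seeds, which is exactly the kind of mixing the disclaimer is about:

(* A lazy infinite binary tree; every node carries a pseudo-random value. *)
type tree = Node of float * tree Lazy.t * tree Lazy.t

let rec build seed =
  let st = Random.State.make [| seed |] in
  let value = Random.State.float st 1.0 in
  let left = Random.State.bits st in
  let right = Random.State.bits st in
  Node (value, lazy (build left), lazy (build right))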

You're right about carrying

You're right about carrying an index through recursive calls. At least, that's no worse than threading state.

I like the infinite tree idea as well. Having such a PRNG would make implementing random/delve even simpler -- random would walk down the left children, and delve would step to a right child.
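
Roughly, assuming a lazy tree like the one sketched above:

(* random i walks i steps down the left spine and reads the node's value;
   delve i does the same walk but hands a sub-module the right subtree. *)
let rec random (Node (v, l, _)) i =
  if i <= 0 then v else random (Lazy.force l) (i - 1)

let rec delve (Node (_, l, r)) i =
  if i <= 0 then Lazy.force r else delve (Lazy.force l) (i - 1)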

Agreed on the cryptographic comments -- I'm sure there's something wrong with seeding the PRNG with its own output (as I do in my implementation), but I don't have the crypto background to say what.

Entropy

You are reducing the entropy in the sequence by reseeding with the output. It would take some serious effort to analyse what you are doing properly, so take this with a pinch of salt, but the separate streams will correlate. The correlation may be hard to spot by looking at a couple of decimal places over a short run, but you've just made breaking the seed / predicting the next output easier. I suspect that for most PRNGs the other effect would be to reduce the length of the orbit.

Cobbled Together

Couldn't agree more. It looks like a band-aid solution to me. In other words:

1. You do thread state; it's just a different state from the seed/number one would thread in a real PRNG solution.

2. Don't expect hard random results. Good PRNGs are good PRNGs because they satisfy certain properties (probably). Just 'seeding' in a different manner will break those properties (probably).

3. As a stopgap solution it could make sense in a number of pure functional programs, but you'd need to consider the above two points.

Reseeding is irrelevant

I know there are better ways to implement the proposed interface (e.g. forkable PRNGs, as mentioned elsewhere). Do you have any issues with the interface, as opposed to the implementation?

I'm also not clear what you mean when you say my interface threads state. I take state threading to mean a coding pattern in which a state is passed to a function, and the function returns a new state that must be used in the next (lexical) call. You could argue this happens with delve() (in that it returns a new state as a closure), but the use pattern I outlined above restricts state passing to follow the logical structure of the code rather than its lexical structure. To make the difference clear, the proposed interface changes this:

let create_object state =
  let x, state' = random state in
  let y, state'' = random state' in
  { x = x; y = y }, state''

let create_game state =
  let player, state' = create_object state in
  let opponent, state'' = create_object state' in
  { player = player; opponent = opponent }, state''

into this:

let create_object random =
  { x = random 0; y = random 1 }

let create_game random =
  { player = create_object (delve random 0); opponent = create_object (delve random 1) }

I think we can agree that's an improvement.

Applicative isn't really applicable here

Anyway, the idea is that using randomness doesn't need to sequence your applicative expressions like using a monad does.

You're correct that the enforced sequencing is spurious and undesirable, but using an applicative functor instead of a monad doesn't avoid it. Both mix monoidal structure with the functor, which is where the sequencing comes from. The difference is that monads are asymmetric, giving a notion of causality that permits embedding control flow into the functor; whereas applicatives are symmetric, and so sequence computations independently of the values in the functor.

In either case, if you're combining two computations such that generating random numbers in the first impacts random numbers generated in the second, you're going to be stuck with either threading state somehow and the spurious sequencing that implies, or using impure functions to get nondeterministic values from the outside world.

Alternatively, if you have a PRNG with good splitting behavior, you can fork the seed for each combination so that the only sequencing imposed is parent-to-child, following the tree of function calls. This still adds some spurious junk (e.g., unnecessary splitting to provide seeds to computations that never generate random numbers) but doesn't unnecessarily sequence functions that merely use random values.

The splitting approach ought to work as either a monad or an applicative functor, with the usual distinction that with the latter, the tree of seed splits must be independent of the random values generated.
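
As a rough sketch of the applicative version, again abusing stdlib re-seeding as the "split" (with all the caveats about split quality applying):

type seed = int

(* Fork one seed into two; a real splittable PRNG would do this properly. *)
let split (s : seed) : seed * seed =
  let st = Random.State.make [| s |] in
  let a = Random.State.bits st in
  let b = Random.State.bits st in
  (a, b)

(* A "random computation" is just a function from a seed to a value. *)
type 'a rand = seed -> 'a

let pure x = fun (_ : seed) -> x

(* Combining two computations splits the seed once; only the parent-to-child
   relationship is sequenced, never the two sides against each other. *)
let map2 f ra rb = fun s ->
  let s1, s2 = split s in
  f (ra s1) (rb s2)

let uniform : float rand =
  fun s -> Random.State.float (Random.State.make [| s |]) 1.0

let pair : (float * float) rand = map2 (fun x y -> (x, y)) uniform uniform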

The downside is that, as far as I know, PRNGs that are well-behaved under repeated splitting are an area of active research, and warrant extreme caution if you need high-quality pseudo-randomness (e.g., for crypto).