Nomenclature for impure functions

The Wikipedia article about pure functions explains that they follow two rules:

  • The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change as program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices.
  • Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.

I would like to know if there is an established nomenclature for impure functions that break only one of these rules. That is, how do I call a function that depends on some external state (like a GetCurrentDate() function) and another that has side effects (like printf)?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

"Side-effect free"?

Maybe "side-effect free" for GetCurrentDate()?

GCC lets you annotate functions as "const" (which means pure; i.e. no reading globals) and "pure" (which means side-effect free).

What is a procedure?

In mathematics, functions are modeled as sets of (input,output) tuples. For example, {(0,1),(1,2),(2,3),...} is the f(x)=x+1 function on the natural numbers. This provides a clear picture of what a function actually is. Specifically, a function is not a formula, there are many functions that don't have a corresponding formula.

Can we give a similar characterization of a procedure? What about procedures in a language with concurrency?

Procedure as Strategy

I haven't read it but the notion of a procedure as a proof via game semantics is presented here.

Languages with concurrency semantics can be modeled in mathematics as describing a set of legal behaviors.

The bigger problem is languages without a concurrency model, where developers commonly hack in concurrency features later because they were suffering, or end up having concurrency anyway to interact with the outside world. In these cases, you don't even have a concurrency semantics to model.

A useful definition in my

A useful definition in my opinion is that with a procedure you proceed through a sequence of operations over time.

Thus the name 'procedure'.

Reader vs. Writer Monad

In monadic terms, the "only depend on the environment" would be a Reader monad, while the "only affect the environment" would be the Writer monad. It doesn't say it all, because you have to specify what's the state carried by those monads is, and here it would be a "World state" a la IO monad (or Clean World).

Lot's of impurity

Sorry, but I don't know the nomenclature, but if a pure function must always return the same value from the same input then
1) any memory allocation can make it impure.
For the heap allocation, that's obvious, but stack allocation can fail too..
2) should writing into a logfile be considered as impure?
That's not so clear: if the programmer understand that the compiler may remove some function call (memoization), then I don't see why it should be forbidden (even though this prevent compile time evaluation of the function).

Infinite Memory

for heap allocation, that's obvious, but stack allocation can fail too..

Most languages do not provide failure semantics for allocation. That is, a failure to allocate would be a violation of abstraction. In this case, the function wouldn't 'return', but rather the error should be shifted to a semantic operation whose abstraction allows runtime errors. (If no such point exists, you might end up dropping the whole program.)

should writing into a logfile be considered as impure?

It's certainly an effect that can be used to communicate with external systems. From a security standpoint, that's very significant. Further, it may have consequences on synchronization. I'd say this certainly isn't 'pure'. I'd rather not start another dialog on whether it should be 'forbidden'.

Control effects

A function can also be impure due to control effects, e.g. exceptions or call/cc. Even non-termination is regarded an effect in many contexts. So there aren't just "two sides" to purity, as the (pretty bad) Wikipedia article might suggest.

A more universal criterium for purity is obliviousness to evaluation order: roughly, a function - or generally an expression - is pure if it never matters when and in which order (or how often) it is evaluated.

Technically, the details of "matters" depend on what exactly you consider possible observations for the language in question. For example, allocation is not typically considered observable in a sufficiently high-level language. A more subtle example is thread creation in a concurrent language (the control flow equivalent of allocation).

My language supports

My language supports exceptions, but they don't affect the rules of optimization. As briefly explained in my comment below, I will have a syntax to distinguish calls with effects that "matters" from those that are just implementation details.

Allocation and Temporal Effects

While most languages don't acknowledge the use of space and logical time resources as 'effects', there are some that do. Linear programming or region inference can allow us to capture spatial semantics. Modeling logical 'delay' is often useful for managing complex parallel pipelines.

You can understand monadic effects as being implicitly temporal because there is a 'before' and 'after' within the monadic context.

Classification of procedure calls

Thanks for all the replies! But it seems no such nomenclature exists!

As a programming language designer and compiler writer, I'm working on a classification of procedures to know what transformations to their calls are allowed by their semantics. (I called them functions in the C tradition, but indeed, I should call them procedures.) My language aims to be purely functional by default but allows side effects with easy annotations.

Right now, I have the following classification:

  1. Pure functions
    Calls to procedures that implement functions in the mathematical sense (that is, pure functions) can be reordered and/or executed in parallel at will. Also, redundant calls (with the same input) can be merged.
  2. Unordered effects
    Some calls depend on and/or modify some state that is not important to the caller since it's just an implementation detail. For example, malloc modifies the heap but the caller is simply interested in obtaining a pointer to valid memory, regardless of the initial and final state of the heap. Those calls can be reordered. They can also be executed in parallel if the implementation is thread-safe. However, they must not be merged! (Malloc returns a different pointer every time it's called...)
  3. Ordered effects
    1. Some calls depend on some state and the very purpose of the call is to get this state. They must not be modified in any way. For example, a call to GetCurrentTime() must not be reordered, especially if it's used to measure the code performance!
    2. Similarly, some calls are made with the sole purpose of creating side effects. They also must not be modified. For example, calls to printf usually require to keep their ordering to make sense!

This classification does not separate "observed state" from "generated effects" but it seems sufficient for my purpose.

Idempotence

At the moment you're classifying effects as ordered or unordered (aka commutative). Another important classification is 'idempotence' - which means that executing twice in a row has no additional effect. A behavior that is both commutative and idempotent allows elimination of duplicates anywhere in the group.

Further, you can classify effects by their lifetime - i.e. state vs. event. There are many benefits for using commutative, idempotent, RESTful effects.

We couldn't find any nonclementure either

We were looking at a similar problem in the CAO compiler recently. In the end we picked Pure Functions, Read-Only Procedures and Procedures as somewhat arbitrary names. The third category allowing both read and write operations on an external environment (i.e no distinction of procedures that only affect the environment). The motivation was broadly similar in that we wanted the language semantics to specify how the optimisations could rewrite calls. Knowing which subsets of the environment are modified is also useful for a parallel assignment construct.

Implicit Arguments

Think about how you would stop violating the first rule -- you would need to pass an additional argument.

So perhaps you can consider those kinds of functions to have implicit arguments.