## Means to Limit or Constrain Side Effects

Simple scenario: I'd like to implement a "copy-string" or "substring" style routine 1) in a language where strings are typically functional; 2) without leaving all such routines to the RTL in another language; 3) without depending on some other inefficient (for these purposes) other RTL functional primitive like "strcat"; or 4) without always relying on using lists as intermediaries for every such low level routine ala "list->string (string->list s)" which again likely relies on non-functional RTL level routines anyway.

Same would go for various array initialization routines, supporting constructors for (otherwise functional) Smalltalk style variable sized data structures, and on and on.

So I seek papers, sample languages and wisdom on potentially clean language mechanisms and their semantics for allowing side effects, say during string/vector initialization as above or in some other possible scenarios (memoization? dunno), without totally letting the non-functional cat out of the bag, as it were.

My interest in this issue is pragmatic (implementing an efficient stdlib/rtl while minimizing code written in a different "lower level" language that would not likely share important properties of the target language) as well as intellectual.

Thanks much!

Scott!

## Comment viewing options

### Mutable Initialization

I'm having some trouble following; your scenario lacks much context or motive. (Are you aiming at performance? semantic side-effects? what? What does 'constructor' or 'initialization' mean to you? Do you expect to see concurrent use of constructed objects? What does it mean to abstract a process or procedure to support your initialization routine?)

Without more specifics, I can only name a few associations to temporarily mutable things that pop into my mind:

• substructural optimizations; if the language can prove that a given region of code holds the only reference to an object (proof can be aided by duplicating a few elements if average gains are greater than dup costs), then the implementation is free to update that object in place, destructively modifying inputs or outputs. Substructural typing (linear typing, region types) is often intended to ensure these optimizations are achieved across module boundaries, but dealing with module boundaries is much stronger than you need for constructor-time optimizations.
• Clojure's 'transients' are explicit, but operate on a similar basis as substructural typing.
• Futures or constraint variables essentially allow concurrent semantics for construction of monotonically immutable objects, i.e. spreading construction over logical time.
• Ruby lets you "freeze" objects after construction, and some people have even implemented "deep freeze" using plenty of reflection. (I do not approve of this approach; it is fragile and not especially secure or parametric. But it does come to mind.)

### Aims and Motives

I *am* aiming at performance if the only RTL functional primitive is to append a single char (elt) to a string (vector). I am aiming for performance and minimization of garbage if the only RTL functional primitive is to construct a list of elements that are then magically turned into a vector/string by that primitive.

Likely not a formal definition, but "initialization" or "construction" means to allocate memory (here sequential cells) and fill it with "suitable" values (according to the type system) as is done in many languages via a wide variety of stdlib functions such as (hypothetically) string-copy, string-concat, list->string, enumerate-vector, substring, make-vector, format (here even building a string incrementally in a growable "buffer") and so on and so forth.

Roughly speaking, I'd like such low level functions to be 1) implementable in the target language with a minimum of RTL level primitive support and 2) as efficient as their C or assembler implementation counter parts.

The goal is (safely) writing such stdlib data structure memory allocating (that's likely a "prim") and element initializing (not a "prim" by design) functions in the target language - and NOT in C or assembler or any other language used to implement low level RTL primitives and facilities where most of the "non-functional" low level primitive stuff is likely to be "dumped" in some other language designs.

Both the substructural typing and "freezing" options look like candidates for further exploration, so thanks much!

Regarding the substructural typing mechanism, is there a simple, canonical or particularly well documented example language that implements this?

Thanks again.

Scott

### Substructural Optimization

Disciplined Disciple Compiler (DDC) for Haskell would be an excellent resource. If you want all the formal details, Benjamin Pierce's Advanced TaPL has a couple chapters on the subjects. Languages such as Mercury (a Prolog descendent) supports these classes of effect typing and Clean (graph rewriting / Functional) has explicit linearity (in the form of 'uniqueness typing').

You don't actually need substructural typing to support the associated optimizations; you only need to identify opportunities that naturally exist in large blocks of code. (On the whole, I'm not too keen on substructural typing because it cannot be cheaply locally enforced in a distributed system. It also can hinder expressiveness [discussed].)

 That said, actually achieving these optimizations is non-trivial. You must compile code properly to destructively modify a region in-place while often keeping enough information for the prior value to influence the new one. This might be highly specialized to each context.

### Issue is sub-arrays

David: I think> the issue here is that he wants to do the initializer within the safe language. Unless the language has some primitive support for subranges, it is likely that he has to first allocate an (initialized) array and then overwrite the designated slots. In consequence, mutable initialization may not help.

### BTW - Constraints on My Inquiry

First, I have no interest in lazy languages or lazy evaluation schemes.

Second, nor do I have any interest in special "functional" data structures conceived as alternatives to old fashioned arrays, strings, variable length portions of structures (as in C and Smalltalk), etc. which are addressable on a typical CISC chip with a single machine instruction (lea on ia32, etc.).

I am interested in any issues that may pertain to multidimensional arrays, just in case I am presently missing some kind of special case considerations for these.

I am also interested in initialization of structure/array elements that may or may not be typed as "Nullable" - *NOT* ML or Haskell style Some/None or Just/Nothing algebraic data structure schemes - just in case that leads to any additional related interesting type system considerations, papers on same, interesting language treatments of this issue regarding array/structure initialization, etc.

Again, thanks much.

Scott

### Optimizations and Models

My interest in optimizing of pure sub-languages has always regarded how to maintain 'pure' models with a minimal set of structural or data primitives, whilst allowing programmers to depend upon some high-utility optimizations.

For example: rather than having a dedicated 'number' type, I would like to model numbers directly. A natural number might be modeled as: type Nat = s(Nat)|z, such that 3 is shorthand for s(s(s(z))). The definitions of sum, multiply, comparison, etc. might be in a standard library, and defined in terms of the Nat structure. However, programmers should know that under the hood these definitions - both the representation and computations - are very heavily optimized.

Similarly, characters might be modeled as Char = u0|u1|u2|...|u10FFFF - basically, a finite set of atoms corresponding to each unicode character. But representations for these types can be squeezed into a 'smallint' range, and operations to test against Unicode character classes should be implemented in a high-performance manner.

And the same goes for the more popular collection types, such as lists. I would like to heavily optimize lists, i.e. by representing them as 'ropes'. By doing so, programmers can have high-performance subrange and append operations, the actual implementations can be very high performance (smaller lists being handled as vectors under-the-hood), and standard operations such as 'foldl' or 'foldr' (or 'map', 'zip', etc.) achieve the same linear performance that programmers would tend to expect from lists, except with a better constant overhead due to locality. This does have a cost: programmers might need to use 'fold' operations explicitly to get the best possible performance (though an optimizer could reasonably be sufficiently smart to identify a few very specific patterns that can be turned into folds, maps, wide-vector zip-apply, etc.). Programmers would get used to that quickly enough.

Similarly, I'd want special optimizations for lists of characters (strings) and lists of booleans (binaries), i.e. by using utf-8 or utf-8z ropes, and binary ropes respectively, and possibly even a bit of compression.

And 'sets' are also a big deal to me. I really want to support 'sets' effectively, since sets scale (in the uber-large) quite well, far better even than lists as ropes - i.e. it is easy to distribute a set across multiple computers, with N-of-M redundancy, based on properties or hash values of the elements. One can perform large operations across sets - map/reduce, aggregation, logical joins, Datalog queries (on a 'set' of propositions). Sets have excellent concurrency and distribution properties, since their value is idempotent on insert (meaning you can duplicate insert messages across disruptions to no ill effect) and order of insert can be forgotten. Sets precisely represent properties of data (a fact spoken twice or spoken late is no less a fact). So, I also want my sets to be heavily optimized as they scale upwards, to make programmers comfortable with using them and. I'm currently pursuing kd-tree and R-tree approaches, where large sets are broken down arbitrarily based on relative attribute values.

'Pure functional' is not quite the ideal platform for these model optimizations because you usually must escape the paradigm before you can express or prove the optimizations. A term rewriting system where terms are modeled with 'evaluation strategies' is a far more promising basis for flexible optimizations across a large variety of models. I've recently been idly pursuing that approach, hoping to resolve TRS with my general inability (for security reasons, and for integration with third-party services and the outside world) to pass models across a network.

But even without an ideal platform, these optimizations can be achieved... and for patterns and standard library functions/types almost as easily as for language primitives. Programmers don't need a lot of special calls or types. They only need to know which 'patterns' of use the compiler knows how to locally analyze and most heavily optimize (such as tail calls, or folds, or append operations). Indeed, 'know thy implementation' is one of the rules of optimization (albeit, well below 'don't optimize yet' and 'profile first').

Anyhow, you claim "no interest in special 'functional' data structures conceived as alternatives to old fashioned arrays, strings, variable length portions of structures." I'm a tad curious as to: why not?

It isn't as though the overall 'runtime' (once you account for caches and memory management) truly has any structures that are 'addressable in a single instruction'. And it isn't as though C's old-fashioned arrays or strings offer especially scalable performance, especially not in a functional setting. In a concurrent setting - including ever more modern applications - you'll eventually discover you want transactional histories or copy-on-write anyway. Offset-based access to lists and strings and such is almost deprecated (as it should be!) in favor of mapping or folding (or parsing, but that's just a specialized fold).

interested in initialization of structure/array elements that may or may not be typed as "Nullable" - *NOT* ML or Haskell style Some/None or Just/Nothing algebraic data structure schemes

That objection doesn't make sense to me.

With regards to initialization of elements, 'nullable' vs. 'maybe' types are pretty much the same. The difference between them regards composition (when you might need to distinguish 'Just Nothing' from 'Just Just Bob'). Even in those cases, you might equate 'Maybe Maybe Person' in Haskell to a safety-enforced variation of Person** in C++.

interested in initialization of structure/array elements that may or may not be typed as "Nullable" - *NOT* ML or Haskell style Some/None or Just/Nothing algebraic data structure schemes
That objection doesn't make sense to me.

With regards to initialization of elements, 'nullable' vs. 'maybe' types are pretty much the same.

I agree. And at the representation level, it is straightforward to ensure that (nullable (ref 'a)) has the same space efficiency as nullable pointers in any other language. There are a variety of issues with nullable references from a "correct programming" perspective, but none I know about from an initialization perspective.

### Clojure's Transients

Transients look promising and more or less mirror my own intended functionality. Closure is a latently typed language, but right now, I can't see that presenting a problem for a typed language implementation of a similar transient style feature.

One deficiency, also a strength, is building vectors via 'conj' without specifying their length in advance. The strength, of course, is guaranteeing initialized vector elements.

I imagine that a "raw" sized vector memory allocation and use of an equivalent of Clojure's "assoc!" (roughly v[x] = y) would require some fancy footwork to guarantee initialization of all vector cells, if possible at all.

Any ideas on compiler level treatment of this latter issue? Or would this be either some kind of runtime check during the call to Clojure's 'persistent!' (or equivalent) or even just allowing for a lapse in safety during vector/array/etc. initialization?

Scott

### Three Possible Paths

Is the issue here mutability or purity? There are a number of languages (including BitC) in which non-escaping mutation is permitted. Not sure if that helps. In any case, I can think of three approaches that might be relevant or may suggest ideas for you:

In BitC, we noticed this as a general problem and introduced a vector initializer:
 MakeVector:pure fn (n : int, L: pure fn int -> 'a) -> vector('a) 
MakeVector is a constructor that calls the lambda form to obtain the initializer at each position, passing the index of the position whose value is to be produced. The type is not quite right, since we want L to be non-mutating rather than pure (it should be allowed to access mutable data structures). This approach can do what you want, but I have come to dislike it.

Because BitC has both arrays (size part of type) and vectors, we introduced something we call (ArrayRef 'a). This is conceptually similar to a by-reference parameter, in that it can only be specified at argument positions (this can, if you like, be allowed at let bindings). The key requirement is things of type by-ref or array-ref do not (by construction) escape past their point of initialization. An ArrayRef amounts to a non-escaping pair consisting of an inner pointer into an existing array or vector and a length. If OverArray is a procedure:
 OverArray : fn (ar : ArrayRef('a)) -> 'b 
then it can be called with any actual parameter of type Array('a, anysize) or Vector('a). This was motivatd, in particular, by things like the "read" primitive in the I/O library, where we did not want to instantiate separately for every possible array size. If we had an array "slicing" construct (which, now that you prompt me to see how to do it, we clearly should), the ArrayRef notion could be straightforwardly extended to allow slices. Concerning this, I'll note that ArrayRef is a hack-ish workaround for the fact that I didn't want to deal properly with region typing. We've since made the decision to do regions, so this aspect of the language may morph into something more general.

The third approach is perhaps the most interesting, which is to note that purity may not be quite what you want. The real problem here is that you want a string which is mutable and then becomes pure at the point of return from the initializer. The problem that you describe is a particular case of a general problem: initialization of rich, arbitrarily connected data structures that become immutable after initialization. The key is to notice that a reference of type 'a:ref can be "cast" to (immutable 'a):ref at the point of initializer return provided that no mutable reference to the returned object is live at the moment of return. It can be cast to (deep immutable 'a) provided that no mutable reference into the graph designated by the returned reference is live at the point of return.

The "live at point of return" part is merely a form of escape analysis. With this in place, the remaining trick is to have a mutable string-like type that can sensibly be cast to string (which is understood to be immutable as a consequence of its primitive type definition).

Hope some of this helps. Feel free to ask questions over on the bitc-dev mailing list if you wish.

### The real problem here is

The real problem here is that you want a string which is mutable and then becomes pure at the point of return from the initializer. The problem that you describe is a particular case of a general problem: initialization of rich, arbitrarily connected data structures that become immutable after initialization. The key is to notice that a reference of type 'a:ref can be "cast" to (immutable 'a):ref at the point of initializer return provided that no mutable reference to the returned object is live at the moment of return. It can be cast to (deep immutable 'a) provided that no mutable reference into the graph designated by the returned reference is live at the point of return.

Thank you for framing my problem and inquiry better and more succinctly than I can.

I think Clojure's transients are a way to avoid such escape analysis and any related type system 'mutable -> immutable' style issues (it's a latently typed language anyway). Transients encase and "delay" all mutations on some other base object - if sometimes only a fresh, empty object - within the otherwise (mostly) useless transient object until the final call to 'persistent!' yielding a new immutable object - upon which the transient object becomes useless.

So far, I think it's really a rather elegant solution given the mechanism's simplicity (save, perhaps, ganging the mutations until the final persistent! call).

Supporting more direct operations on typed mutable structures, especially complex graphs of objects that you mention, that are then safely "cast" to immutable ones seems to require both escape analysis and some kind of support from the type system.

I'm noodling around looking for papers related to this latter option, but only finding articles on escape analysis with different foci and nothing on type system support for this type of language level mechanism.

I'm likely unaware of the proper search terms I should use, so I welcome any pointers to search terms or actual papers that might help in my ongoing inquiry.

Thanks much!

Scott

### Concerning transients

I haven't had a chance to delve deeply into transients, but they look (to me) like they solve a somewhat different problem that includes yours as a subset.

I haven't asked Rich, but I suspect that he avoided regions for many of the same reasons that we did so: they are tricky to implement well, and they can become invasive at interface boundaries. For myself, I can add that I didn't appreciate Cyclone's approach adequately until I had learned a bunch of things from BitC.

We may yet back out, but my feeling in BitC is that region annotations will resolve a bunch of other idiom issues that I'ld kind of like to see resolved. We'll have to see how it goes on that.

### Type system support

Supporting more direct operations on typed mutable structures, especially complex graphs of objects that you mention, that are then safely "cast" to immutable ones seems to require both escape analysis and some kind of support from the type system.

Yes. Further, it's not something you want to infer generally. If you have first-class procedures in your language, you probably want at least a restricted form of region analysis in your type system in any case in order to perform accurate escape analysis. This builds on the same analysis with minor revisions.

The other half of the trick is to notice that immutability is a liveness property. In general, a coercion from a read-only or writeable reference to an immutable reference is not legal unless you can show that no writable reference is live. In the presence of concurrency, that's usually impossible. But in this very special case of "generalized initializers" there is a magic place at which you can easily know: the return point. Strictly speaking, this is a property at the intersecting boundary of type and liveness analysis.

I noticed the possibility of this approach to immutability coercion about three years ago, and I actually haven't seen it elsewhere in the literature. Perhaps somebody here will have a pointer to relevant prior work/research.

One catch: introducing this allows you to build pure cyclic structures. If you aren't careful, this can break your ability to reason about termination by downward induction on constructors. That can be recovered straightforwardly by another typing trick if needed.

### Is the problem well-motivated?

I wonder whether it is essential to implement core types like strings in the safe language. Perhaps one can achieve application-level goals with a library that makes judicious use of unsafe code for some commonly-used performance-sensitive types (such as strings). I suspect it might work to write application code in a safe language, linked against a library that is partially implemented in an unsafe language (with the unsafe parts subject to rigorous quality control: e.g., careful testing and code review). I know it's less elegant, but pragmatically, this seems like it might be more achievable. What do you think?

### Depends on Goals

I think it depends on your goals. Also, go look at my statement of the general problem (above at Three Possible Paths). My sense is that one wants to deal with that issue, and this case comes along more or less for free once you do.

While you can always trade back and forth between the unsafe runtime and the safe runtime, the less stuff you have to audit the better, both from a security/safety standpoint and from a portability standpoint. One of the things I consider "promising" about BitC is just how small the C portion of the standard library is. It's clearly going to grow (because of I/O), but functions exist in C for only a short list of reasons:

• Efficiency. Our decision to use utf-8 encoded strings makes indexed access inefficient. I need to go back and re-examine that decision, but that's an entire class. Even so, there are surprisingly few of these.
• Access to inline assembly. Only example: rdtsc, and this one doesn't belong in the standard library at all.
• The stdio wrapper functions.
• The GC, which is currently a placeholder
• The low-level implementation of MakeProcedureObject, which is special because it allocates a heap-resident code object.

There are also some inline functions corresponding to primitive ops (add, subtract, and the like). These appear in a single header file, and much of the complexity of that file comes from the fact that we did not use C++ as our target language (and consequently gave up overloading).

Sum total C code in the runtime (as of today, anyway): 1549 significant (per sloccount) of 2860 textual lines. Implementing the GC, of course, will likely increase this.:-)

And yes, so far, it's a toy runtime. We didn't want to invest until we had a final surface syntax. And yes, it will certainly grow over time. But it continues to pleasantly surprise me just how much of the BitC library can be implemented in BitC itself, and how this contrasts with my experience working on the innards of Franz LISP and Tiny Scheme. Too early to draw any real conclusions, but for lack of a better way to express matters, it "feels" promising.

### Different goals

We're not supposed to discuss design, so I'll just summarize that I've always been a fan of "small core" languages - many Forths, Smalltalk, some Lisps, etc. I'm exploring language concepts in my spare time (all my time is "spare" as I'm currently chronically ill) via implementation of a strict, "kinda ML'ish + PLT Scheme'ish" language.

It has a few odd features and implementation constraints by design: unsafe native operations, user level safety can be added via "require/ensure", anonymous record operations modeled after a paper by Cardelli, a very small object system modeled after PLT's, a simple module system, compiled to assembler (NASM) and assembled to regular object files and linked with the "normal" linker, as small a runtime as possible in NASM and a little C, very convenient FFI support (high priority item), inline assembler with macro level support to help honor runtime constraints, blah blah blah.

My own interests are then only really peaked by what I can then ADD to such a core language - both features and RTS.

Anyway, I guess one can say that it's an overt goal of mine to keep this ML'ish language otherwise fairly (very?) low level, as friendly to the machine + os + external libraries as possible and with only a minimal runtime.

Still a work in progress, but I can imagine even making the runtime less than performance optimal in order to reduce/simplify the constraints on the target language's various "low level" or "machine/os environment friendly" facilities - but we'll see.

Scott