Can we have a polite, orderly and short discussion of the political, social, institutional aspects of the Swift language? This thread is the place for it.

PL designers will be in

PL designers will be in vogue again and grab some of the vast attention going to the deep learning people... for a couple of days. There will be more competition in the PL space, hopefully driving more innovation, and more employment opportunities for said PL designers.

Apple seriously wants their ecosystem to be special, to distinguish itself from Android, lock in developers, and prevent code from being shared between the two. It will be interesting to see what Google does, as Java is long in the tooth and neither Dart nor Go is suitable as an app language. Maybe they'll adopt Mono C# :) JavaScript/web is insufficiently expressive to matter much in the app space.

Can languages really lock in

Can languages really lock in developers? Can't languages be compiled to any target? Discuss.

Who is going to bother with

Who is going to bother with full Swift dev support (more than just an output target) for their platform, though? Well, I can see Google maybe doing it, especially given that this puts them in a spot right now.

Mono + Linux

Although you can run Mono on Linux, it is only the core .NET platform, without all the useful libraries. As such it offers portability in name but not in practice. It's like this: all the Linux Mono apps will run on Windows, but not the other way around.

Platform already does the lock in

People writing iOS apps are already locked in by Cocoa, etc. Besides, Objective-C might as well be Apple's own private language anyway. I don't think that a new language is creating any new/additional/stronger lock-in.

Multi-platform applications

For some reason some developers get away with targeting multiple platforms. Honestly, I have no idea how they do that.

C or C++?

C or C++?


I guesstimated C++ for a particular software house and looking at their hiring positions that seems to be correct.

cheap labor negates lock in

Swift is designed for use by cheap labor. It is designed in a way that I think most working programmers could pick it up well enough to start using it almost immediately. The quantity of code in a typical "app" is not very large. "Lock-in" can't function here.

By contrast, a commitment to use (say) Haskell or Lisp can create lock-in because there is a scarcity of qualified labor competent in those languages available to some sectors and because those languages are meant to be used in larger systems. If your firm is relying on a very large lisp program, for example, you are pretty much locked in to an expensive labor market.

Apple's quirky language, in combination with other factors like developer fees and non-standard APIs, can all add to the marginal cost of writing for or porting to Apple's platforms. See my other comment about "Apple seeks to raise development costs".

api licensing

I think de-facto lock-in would not be difficult. There is already recent legal precedent for copyrighting of APIs. And what is a standard set of libraries, if not an API? And who knows what will happen if the programming language itself becomes considered a creative design suitable for copyright (certainly, one could develop a convincing argument for that).

Apple hasn't yet revealed its licensing plans for Swift.

api licensing

Let's not count that as a precedent yet. There is still a lot of legal fighting to be undertaken. If it does eventually hit SCOTUS, it may well be another kick in the head for the CAFC.

The language itself can be copyrighted. There are a number of those already. But as has been found previously, copyrighting a language limits your user base to fairly specific niche markets.

For more general-use languages, copyrighting will only work if your user base is significantly large enough for you to have the upper hand. Feedback I have seen (in other fora) shows a marked increase in moving away from Java and the JVM because of the legal fight over whether or not an API is copyrightable. This migration may only be small yet, but if the right people get involved, then entire companies can change the environment they use for development.

Lock in

I think it will be very locked in in practice. Apple will probably open source the basics of Swift, the same as Objective-C is part of GCC. At the same time, though, Apple is tying the language tightly to Xcode for development, for example playgrounds. They are going to tie the language tightly to the Cocoa libraries: "seamless access to existing Cocoa frameworks." Most large Swift applications will probably have some Obj-C mixed in, with even greater platform dependencies.

I suspect that GNUstep (http://www.gnustep.org) will port Swift, but at best the port will lag something like a decade behind the proprietary version and won't support much new code in practice. Fine for academia (which is what LtU focuses on), but commercial Swift will never move.


Why isn't Dart suitable as an app language? It seems to be the existing language most similar to Swift. Is it a matter of libraries and focus? Isn't their focus large-scale web apps?

Dart comes from the

Dart comes from the JavaScript/web perspective, and it would have to grow another app oriented ecosystem that wouldn't resemble its web ecosystem. Related, Microsoft tried to make JavaScript an app language in WinRT, but it hasn't really caught on. So it is possible, but it's not clear that is the best approach.

Robert Harper causes a storm

Robert Harper caused a little storm on Twitter, saying that "real" PL experts should be hired. He was accused of applying an ad hominem (which I don't think this is) and apologized. I think his claim has deeper truth than that, though. Whether language design is driven by attempts to push the envelope and advance the state of the art or by other considerations is surely an issue. I think the emphasis should be put on what Harper calls "remaking the old mistakes" rather than the less than optimal use of the phrase "real experts".

What mistakes?

I'd be curious to hear what he believes the actual mistakes are, so that a reasonable design discussion can be had. One problem with Twitter is that complete agreement requires only a single bit: click "favorite". Meanwhile, even the smallest disagreement often requires much more than 140 characters to communicate.

I'd be surprised if he

I'd be surprised if he didn't think the whole OO thing was a mistake. I can't find too many novice mistakes with the system.

Can't undo that "mistake" now

The entire Apple software stack depends on it.

Vendor languages are a problem

I am not a fan of vendor-specific languages. I have problems with the idea that universities could teach these languages and become part of the marketing machine of a corporation. (Public money should not be used to favour one competitor over another, and corporate funding of education seems wrong, as learning should be an end in itself, in the tradition of universities.) I am certainly not against vendors running their own courses on their languages using their own resources.

It also results in wasted effort as C#/Java/Swift are all looking at the same problem domain.

When academia had similar problems with the proliferation of lazy pure functional language prototypes, the solution was to all get together as a committee and write one language for everyone (Haskell). Is it time for industry to do the same?

Finally, I am not a fan of languages designed by committee, as they tend to be big and unwieldy (Haskell, Ada, etc... actually Haskell is okay, as they only accepted the really well-established common bits; is it the exception that proves the rule?). Some of the neatest languages have been the work of a few people: ML, C, Pascal.

So what we need is a small team of people who are platform / OS neutral to produce a Java/C#/Swift type language, and release it as an open-standard and persuade all the OS and system vendors to adopt it.

Ada was not designed by a committee

Ada '83 was designed by Jean Ichbiah, and Ada '95 was designed by Tucker Taft. Of course they had input from other people as well. The Steelman requirements to which Ada (and three other programming languages) had to conform were, on the other hand, created by a committee.

I was thinking of the requirements

It seems I was thinking of the requirements...

You made me choke in my coffee

When academia had similar problems with the proliferation of lazy pure functional language prototypes, the solution was to all get together as a committee and write one language for everyone (Haskell).

Yah. Okay.

So what we need is a small team of people who are platform / OS neutral to produce a Java/C#/Swift type language, and release it as an open-standard and persuade all the OS and system vendors to adopt it.

Well. I completely disagree. What we need is a healthy ecosystem.


So what we need is a small team of people who are platform / OS neutral to produce a Java/C#/Swift type language, and release it as an open-standard and persuade all the OS and system vendors to adopt it.

Java is an open standard and platform-independent. That being said, why would OS and system vendors (mostly) want to do that? What's in it for them? Apple's programming languages exist to help sell Apple hardware. Microsoft programming languages exist to sell Microsoft OSes and server products. IBM vertical programming languages exist to sell expensive IBM solutions, especially consulting and management solutions. Etc...

Java controlled by license

Java is not open. See Dalvik and Google implementing not-Java.

Languages do not exist to promote vendors products although vendors may wish they did.

I do not want to have to reimplement my software in different languages for different platforms. It's hard enough dealing with platform differences. In a free market for hardware software needs to run on all platforms. In a free market for software all hardware needs to run it. I think trying to control both is close to anticompetitive practice, and probably should be regulated. Like in the UK you cannot own both trains and the tracks, or supply electricity and own the wires.



You had commented about "persuade all the OS and system vendors to adopt it". The advantage you give above is that it assists in the commodification of hardware and OSes. That's precisely right. Being commodified is something that vendors try to avoid, not something they try to promote, since it results in competition mainly on price. Platform vendors want software to be a differentiator; they don't want a situation where all hardware runs all software. You saw exactly that situation in the Wintel market, and the result was a complete collapse of margins, leading to a diminution of quality, leading to a collapse in price, i.e. further loss of profits. That's a disaster from the platform makers' perspective. The advantage you are presenting is a disadvantage for them.

If you are Google, that is, if the goal is to sell advertising, not hardware and OSes, then commoditizing the platform is a good thing. But that's precisely because their interests lie in collapsing the margins on their platform. Now, in terms of regulation, it may or may not be in the government's interest to regulate platforms and open systems. I tend to think the exact opposite of you. Regulations that require platform X's software to run on platform Y require either that X implement all of Y's standards or that X not innovate. Obviously you can't have a situation where more than a very small number of vendors are each required by law to implement each other's innovations, unless the innovations are regulated out of existence.

I think platform advancements are vastly more important than software portability. In a situation where platforms are highly differentiated, the cost to the industry is that software has to undergo a complex migration process to cross over between platforms. Which means that within each platform you have a "free market" (as you were using the term), and then another "free market" for platforms, where the software available becomes part of the decision process. That IMHO is a net positive, because there are tradeoffs between platforms. A good example of this is the collapse of the middle of the market in tablets, where we now have two highly differentiated platforms:

Android offering a rapidly advancing set of functionality for highly price sensitive customers aimed primarily at video playback (i.e. an Android tablet is mostly a handheld television that also runs some software)

Apple iPad offering a rapidly advancing set of functionality for semi-price insensitive customers aimed primarily at replacing laptops (i.e. an Apple iPad is mostly a low end laptop with excellent battery life optimized for short interactions)

Taking that example why should we expect those two customer bases to mostly even want the same software? Most software is either going to be targeted towards people who want high service levels (semi-price insensitive) or good pricing (price sensitive). In a free market we wouldn't want and shouldn't expect much cross over.

Funny, I use my iPad more

Funny, I use my iPad more than my laptop now, though I still use the laptop a lot. Replace "short interactions" with browsing, and you might have something accurate. It's nothing like a laptop unless one is crazy enough to throw a keyboard on it.

Is there a tablet that is even cheaper than the iPad for the same feature set? Being semi-price-insensitive has nothing to do with it when everything else is junk.

Use iPad more


Replace "short interactions" with browsing, and you might have something accurate. It's nothing like a laptop unless one is crazy enough to throw a keyboard on it.

Usage numbers show that people are replacing computer interactions with iPad interactions. Usage is decreasing for laptops and desktops because of replacement. Browsing is obviously a terrific example of something people can do on both devices. Video conferencing especially distributing meeting documents and in meeting annotations is another. Photo organization...

Is there a tablet that is even cheaper than the iPad for the same feature set?

If you mean hardware, Google Nexus tablets (http://www.google.com/intl/all/nexus/tablets/) are a good example. But in general, no. That was my point: Android tablets are thriving at the $75-150 price point with a much lower feature set. The tablet market at this point has effectively segmented into two distinct products aimed at two distinct user bases with strikingly different preferences.

Being semi-price-insensitive has nothing to do with it when everything else is junk.

Saying "I'm willing to pay much more for a superior experience" is being semi-price-insensitive.

Usage decreases for laptops

Usage decreases for laptops because much of what people were doing with laptops was more appropriate for tablets. They were not laptop interactions in the first place; we just had nothing better to do them with before. It turns out we are mostly passive, consumption-based beings, which is fine.

Poor people remain poor because they are forced to buy junk that has to be replaced more often. In the long run they are actually losing money, so buying something nice and sturdy is actually more economical.

You make it sound hard.

But it's easy: everyone develops and shares the language between vendors, and puts the cool stuff in libraries.

I am not asking for lowest common denominator, I am asking that business logic be shared between platforms.

Just a common language. No need for identical APIs, though of course stuff that everyone has, like basic file access, would be handy if it could be standardised.

A further thought: if software vendors' interests lie in a single application source base (which I think they obviously do), then which hardware vendor aligns best with their interests? It would seem to be Google. In that case Apple has a window of opportunity to lock in the market before Android becomes good enough for the average user.

I don't get the app divergence thing. Enterprises want mobile access to their systems, so the enterprise determines the required functionality and it gets implemented on whatever hardware, or, if BYOD, on all popular hardware.

Most people regard fragmentation of Android as a bad thing, so fragmentation of the mobile market is the same, a bad thing. I don't think you can say one is bad and not the other.


Isn't JavaScript our lowest common denominator language? It runs pretty much everywhere and people use it *because* it runs everywhere, despite its flaws.

Doesn't have full access

The comment about making it sound hard was in reference to the argument that somehow improving platform features makes cross-platform compatibility impossible. This is of course a false dichotomy; one does not preclude the other.

JavaScript does not give access to all the platform and OS features, and does not allow a 'native-app' feel. I don't want the lowest common denominator; I want full access to platform features and cross-platform development from a single language, so you can use an iOS date picker on iOS and an Android one on Android. However, when you consider that JavaScript development is cheaper than native, I do ask myself why bother with native when most apps do not need any native platform features.

Many apps are now actually hybrid apps with a minimal platform-native shell and content in HTML/JavaScript/CSS, using an embedded WebView component. Apple makes this hard, though, by not allowing easy calls from JavaScript through to native code. There are some hacky solutions using URLs that are used by various products, but just allowing native code to register JavaScript callbacks, like Android does, would be much neater.

Another alternative to JavaScript is C++ with a wrapper for iOS and NDK on Android. Using the unix networking and file access and Open-GL ES graphics you can get some way towards cross platform mobile with good performance.

Mostly Android and iOS versions are implemented separately following the individual platform UX guidelines, from a single master design generated by iterating the UX with the client during requirements capture.

Dynamic Binding

It seems clear that C# and Swift both try to embed the platform's dynamic binding into the language, so that such plumbing can be hidden from the language user. There are dynamic scripting languages like JavaScript and Python that also hide this kind of dynamic plumbing and can be used as part of an app on iOS and Android via an embedded interpreter.

There does seem to be a lack of an independent compiled language with support for dynamic binding of objects. However even if there was an open language similar to C# and Swift, unless the platform vendors support it, it seems unlikely to get any traction.

Regulation and Free Markets

You clearly misunderstand the regulation aspect. Of course you would not regulate that vendors implement each others innovations, that would be ridiculous.

The regulation would be that you cannot own a controlling interest in both a hardware platform and a programming language (or something like this). Apple could easily comply by getting Swift ISO standardised, and relinquishing control to the standardisation body. I am sure some suitable regulation could be developed by regulatory experts that is much better than my amateur musings.

I also don't see how your use of the free-market concept works. In a free market, the user who has any hardware device is free to choose any software they like, unconstrained by artificial restrictions on that market, whether imposed by government or vendor (Apple). Also, the user of any given piece of software is free to choose any hardware (without the software license preventing a change of platform). Your use of the terms just does not make sense; it's like saying "Freedom is slavery". In your case, saying software choice is restricted by your choice of hardware platform, or your hardware platform is restricted by your choice of software, equals freedom? Traditionally hardware choice is driven by software, so I can see the attraction of making certain software available only for your platform, and it is clearly in the vendor's interest, but it is also clearly not in the interest of the end user, nor of the independent software developer.


The regulation would be that you cannot own a controlling interest in both a hardware platform and a programming language (or something like this). Apple could easily comply by getting Swift ISO standardised, and relinquishing control to the standardisation body. I am sure some suitable regulation could be developed by regulatory experts that is much better than my amateur musings.

Objective-C isn't owned by Apple; it is incorporated into both LLVM and GCC. It is Cocoa that Apple owns. I'd expect the licensing is going to be the same for Swift: Apple isn't going to own Swift de jure, just de facto.

In your case saying software choice is restricted by your choice of hardware platform, or your hardware platform is restricted by your choice of software = freedom?

Well, yes. You choose the function, then the software you want to run, and then choose appropriate hardware to run it. The free market is allowing the person to choose freely from available products. Software designed to run on a supercomputing cluster cannot be made to run on a coffee maker, and in the reverse direction a supercomputing cluster has no way to boil water. PBX software is going to require hardware that can hear and generate channel-associated signaling. There is no way to avoid a tie between function, software, and hardware.

and it is clearly in the vendors interest, but it is also clearly not in the interest of the end user, nor the independent software developer.

I don't think that's clear at all. End users seem to benefit quite strongly from more heavily integrated systems, and when given the choice mostly choose vertical integration over de-integrated component systems. There is a reason for that.

As for it not being in the interests of ISVs, again I don't see it. The vast majority of ISVs exist because vertical integration drops the cost of their little add-on enough to make it possible to develop and market their products. The SharePoint ecosystem wouldn't exist without SharePoint. The flood of mobile applications exists because Apple, Google, and to some extent Microsoft and Blackberry created rich vertical ecosystems. We don't have a rich and diverse collection of software to run on interstellar transport devices because there are no interstellar transport devices for it to run on.

wrong approach

This assumes developers exist to create SharePoint applications. This is wrong. Developers (independent or otherwise) exist to service the business needs of their clients. What I care about is the increased cost of maintaining multiple versions of software on different platforms. I also care about getting significant market share for products in a fragmented market that necessitates multiple versions. You seem to have no answer for these real-world problems. Talk of market theory does not help my bottom line. Do you have any practical suggestions to help businesses in this market?

I disagree about the tie between hardware and source code. I can write software that will scale from a phone to a supercomputing cluster using Unix/POSIX C++ and MPI that just requires recompiling for each platform. The PBX point seems inaccurate: we run VoIP Linux PBXs here with no special hardware at all in the server; they are just regular Linux VMs in our rack.

Freedom to create restriction is not freedom; this is the Russell paradox incarnate. Giving Apple the freedom to restrict people's software choice is akin to giving slavers the freedom to make people work for nothing (a bit dramatic, I admit, but it is the same misuse of the concept of freedom in a restrictive sense). As such, restricting Apple's ability to lock people into the platform is increasing freedom (freedom to restrict is not freedom, but restricting a restriction is). It would appear this is the understanding behind things like the unlocking legislation, which restricts carriers from preventing people unlocking their phones, giving the end user the freedom to change carrier. How is freedom to change carrier without changing phone philosophically different from changing phone without changing software?

If Apple does not own Swift, who controls the standard and ensures compatibility between Apple's version and independent implementations (if they are even allowed)?

From phone to cluster

I can write software that will scale from a phone to a supercomputing cluster using unix/POSIX C++ and MPI that just requires recompiling for each platform.

No, you can't. You can write some software, but getting the mapping between hardware and software right is usually a hard problem. Often a black art.

OpenMP & MPI

OpenMP can be used for automated parallelisation on an SMP machine, and the code looks pretty normal. MPI requires coding to an API, but it can still run on a phone, although it would be pretty useless on a single-core one. Plenty of phones are quad-core, though, and could run stuff written with MPI; in any case, you code the algorithm once and provide the processor and host configuration at runtime. This would work fine on any of the really big shared-nothing Linux supercomputer clusters.

I don't think a reference to MPI on supercomputers is necessary. Here is one for phones:

business models

This assumes developers exist to create SharePoint applications. This is wrong. Developers (independent or otherwise) exist to service the business needs of their clients.

Clients have already chosen a platform. If their workflow is SharePoint-oriented, then absolutely the developers (small ISVs) who are engaging with them exist to extend that SharePoint workflow. You are part of an ecosystem, not the entire thing. So if you are developing for particular clients, you develop for whatever platform they want. If you are developing as an ISV, you develop for an ecosystem where there is demand. If your product does really well on one platform, you move to others. And for each and every single one of them you write a quality application.

Abstract where you can, but don't where you can't or shouldn't.

You seem to have no answer for these real world problems. Talk of market theory, does not help my bottom line. Do you have any practical suggestions to help businesses in this market?

My answer stays the same for the whole thread.
a) If the application would work well as lowest-common-denominator, then use a lowest-common-denominator approach, for example PhoneGap, HTML5... All platforms support this.

b) If the application benefits from platform-specific features, write to the platform, in a platform-dependent way. If the platform dependencies can be abstracted, maintain multiple versions. However, huge chunks of the application will often be cross-platform, which is where LtU-type discussion comes in. For mobility, hybrid solutions like Xamarin offer a nice way to achieve mostly platform independence.

There is no problem. You are just conflating two entirely different problem domains. The reason generic mostly doesn't work is that the end users want the advantages of platform-specific. And that's going to raise your cost, so you pass the cost on. If you aren't being platform-specific, you aren't solving their problems. That's the practical suggestion: do what the platform vendors and your end users want you to do. Do it the right way, not the fast, cheap way. Stop being lazy.

The PBX point seems inaccurate, we run VOIP linux PBXs here with no special hardware at all in the server they are just regular linux VMs in our rack.

The specialized hardware is then the phones, which are doing the SIP conversion, and at the carrier, which is doing the handoff. You have just outsourced the hardware that does the PSTN handoff. That, BTW, is a good example of how well platform-specific works: you were using a platform-specific approach and it worked so seamlessly you aren't even realizing it. The carriers are doing exactly what I'm advising above.

Giving apple the freedom to restrict peoples software choice

Apple is not restricting anyone's software choice any more than Coke does when they won't sell you Pepsi in their bottles. Apple customers want Apple hardware, Apple OSes, and vendors that are willing to support the platform. You obviously aren't willing to support that platform. And by supporting the platform I mean immersing yourself in the culture and writing to the platform. Apple offers interfaces for generic software that isn't platform-specific; they do exactly what you want.

Your problem probably is that customers prefer the platform-specific code and you lose the sale when you write generically. Well, good. Apple created a market for mobile software that's high quality. And one component of high quality is fitting the platform the users are engaging with your application on.

Getting Closer

So let's get back to the original point about Swift and vendor-specific languages, as I have answered most of the other points in my response to your other post below.

It would appear you agree with me. You say, "However, huge chunks of the application will often be cross-platform, which is where LtU-type discussion comes in. For mobility, hybrid solutions like Xamarin offer a nice way to achieve mostly platform independence." Great, that's exactly my point, and it is the reason why a non-vendor-specific language is better. Xamarin will not offer as good integration with iOS as Swift, so it is a less optimal solution.

I think a non-vendor-controlled language would be better for Apple, better for the developer, and better for the end user. Apple has a good record with LLVM and Clang, so let's see what they do with Swift. I would settle for a benevolent dictator if they don't want to standardise the language, but I still think standardisation would be even better.

Swift licensing

If Apple does not own Swift, who controls the standard and ensures compatibility between Apple's version and independent implementations (if they are even allowed)?

Anyone who wants to can create a version, and if they want to, they can create a standard. That standard can either match Apple's implementation or not; if it matches there is compatibility, if not there isn't. In practice there likely won't be compatibility, because the reason to use Swift is to use Cocoa.

In practice there is only going to be one version of Swift that anyone really cares about. I would imagine GNUStep will create a version that allows for porting that will lag Apple by about a decade.

Requires Standardisation

I should clarify that Apple controlling their version of Swift, without any need for compatibility with other versions, would need to be counted as a controlling interest. There needs to be a common independent standard, which Apple does not control.

No high hopes here

I would imagine GNUStep will create a version that allows for porting that will lag Apple by about a decade.

It took both the ML and Haskell teams a gazillion years to get any performance out of reasonably mediocre abstract languages. Not to mention JavaScript.

I have no idea about the performance of Swift (what's in a name), but considering the semantic gap between that language and C, it might die a pretty rapid, silent death if they don't get the performance up in time.

Since the creator of LLVM is

Since the creator of LLVM is the principal on Swift, I think there is a lot of hope that they'll take compilation and performance seriously.


That's a good sign.

Apparently swift is

Apparently Swift is currently very slow: https://stackoverflow.com/questions/24101718/swift-performance-sorting-arrays

460x slower than C++ with optimization. I'm not sure what's happening here, hopefully it will be fixed in the future.

Look at the thread again

That's an excellent link. Since the original post it became clear what the problem was: the checks for integer overflow. Take a look at the later answers. Compiled without overflow checks, Swift was slightly faster than C; with them, Swift could end up about 1000x slower (though that's likely to improve somewhat between beta and release). I'm amazed that the operation is that expensive, so I have to assume that there is poor code there and that Apple will be able to figure out a way to make Swift safe and not too much slower.

Yes, there is no way that

Yes, there is no way that overflow checks make a program 500x slower. That would mean that 99.8% of the time is spent doing overflow checks. There must be something else going on that causes that much slowdown, so it's likely that performance will improve a lot when that gets fixed.

No too bothered. Ship optimized code

After my hasty remark on the speed of Swift I decided to read up on it and the array performance doesn't bother me that much.

If it's safe, performance degrades remarkably, that's true. But the good thing about it is that they seem to be within a few factors of C speed if compiled with full optimizations.

So I guess most developers will debug, where they can, with safe settings and ship the optimized code.

The problem with that scenario is that I expect that, at some point, developers will just compile without array bounds checking most of the time, because they'll need to tune and debug the responsiveness of their applications as they will be shipped. Which means array bounds checking may become a nice but dead (as in: hardly used) feature in the language.

I wonder seriously if

I wonder seriously if overflow checks are feasible in non-debug-mode code without hardware support. If the check was added to the CPU adders, then it would be much more feasible.

Overflow checks can't be the

Overflow checks can't be the problem. There is no way that overflow checks cause 500x slowdown. Even 2x would be high overhead. CPU adders already have overflow checks btw, e.g. x86 has the overflow flag bit.

It depends at what level

It depends at what level they are being applied. There are a lot of integer operations in a typical program, most of them incidental not written by the programmer themselves. However, it is difficult to reason about where an overflow can occur, so eliminating checks on these operations might be very difficult. Perhaps this is something where they will get better quickly.

I'm not sure how the overflow flag would be used in Swift's assembly code (or even if it is, since ARM is in the spotlight here). It's worth studying for those who are interested in the problem.
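For comparison, a software overflow check can be written out in C with compiler builtins (a sketch, not Swift's actual code generation; `__builtin_add_overflow` exists in GCC 5+ and Clang, and on x86 it compiles to an add followed by a jump on the overflow flag):

```c
#include <stdio.h>
#include <stdlib.h>

/* Roughly what a checked `+` does: perform the add, then branch to an
 * error path if the hardware overflow flag was set.  Swift's default
 * `+` similarly traps on overflow. */
int checked_add(int a, int b) {
    int result;
    if (__builtin_add_overflow(a, b, &result)) {
        fprintf(stderr, "integer overflow on %d + %d\n", a, b);
        abort();
    }
    return result;
}
```

On the happy path this is one extra (usually well-predicted) branch per operation, which is why a 500x slowdown from the check alone seems implausible.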

I don't think there is any

I don't think there is any level at which they can cause a 500x slowdown. Even if what the program is doing is 100% math operations that need overflow checks, if you look at any individual operation, how can the overflow check take 500x longer than the operation itself?

No register + branch prediction + cache miss

Maybe the overflow check doesn't use a register but checks a bit somewhere in costly memory? Then there is the extra cost for the extra instructions, messing with the branch prediction, and a possible cache miss on prefetching whatever wherever the handler is located. (And an extra possible cache miss for getting an array size in case of array bounds checking.)

It depends on the translation and hardware. No idea about their translation; no idea how hardware works these days.

If they cannot get rid of the 500x slowdown I think they should get rid of these checks and move them to a debugging switch. I.e., opt in instead of opt out.

I tried to think of a way to

I tried to think of a way to make an overflow check cache miss but I couldn't come up with anything. It's probably something else, like, they are recording the history of the execution for their timeline debugger.

Shouldn't happen too often

If the branch predictor assumes it may overflow and you have a jump, or call, to a location which isn't in the cache it might happen (since the routine being referred to needs to be loaded.) That shouldn't happen too much but it depends on the predictor and the various cache sizes and behavior.

But it really depends on the code they generate.

For a first compiler they might have opted to implement the checks, or even operators, with subroutines so then my argument is moot anyway. Maybe I assume far more optimized code than they have at the moment.

Loop unrolling?

There's loop unrolling too. Forgot about that one. Usually worth it but may mess with everything too unfortunately.

Peephole optimization?

Yet another one I forgot. I wouldn't know how you'd mess with that one, but it may be that if everything is in subroutines in their normal compilation they can't do the optimizations they'd normally do on a simple loop.

The timeline debugger is

The timeline debugger is only active in Playground, it seems. Also, they seem to have frame-by-frame recording/checkpointing.

Someone debugged it: Retain Release

It's their reference-counting GC, it seems. Someone debugged it; it's somewhere in the answers, so we all should read less sloppily. Looks like one of the example programs spends most of its time doing the updates to the reference counters.

If I got it right, this is the code they toyed with:

let n = 10000
let x = Int[](count: n, repeatedValue: 1)
for i in 0..n {
    for j in 0..n {
        x[i] = x[j]
    }
}

The code above is, of course, assuming boxed values, a nightmare for a naive refcounting GC. All it does is retaining and releasing, I imagine, the same value. (It may also release and retain the counters, no idea.)
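What a naive refcounting runtime does for each `x[i] = x[j]` can be sketched in C (hypothetical; the type and function names are made up for illustration):

```c
#include <stdlib.h>

/* A boxed value with a reference count, as a naive runtime might have it. */
typedef struct { long refcount; long value; } Box;

static void retain(Box *b)  { b->refcount++; }
static void release(Box *b) { if (--b->refcount == 0) free(b); }

/* What `x[i] = x[j]` becomes when elements are boxed: two refcount
 * updates per assignment, even when the old and new values are the
 * same box. */
static void assign(Box **slot, Box *newval) {
    retain(newval);        /* protect the incoming value first */
    release(*slot);        /* drop the old value */
    *slot = newval;
}
```

In the benchmark's inner loop this runs n*n times, so the memory traffic on the counters dominates the (otherwise free) copy itself.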

Wouldn't know

I wouldn't know. The minimum cost are extra machine instructions which you could get rid off with your proposal of extra hardware.

But where are you going to store the array size? The beginning of the array seems the most logical and what size should it then be? The cost goes down if bounds checking is much smaller than swapping in pages; but to access the size implies extra paging.

Looks like there will always be some scenarios, like DSP or imaging, where the extra cost of array bounds checking will be prohibitive.

Array bounds checking is not

Array bounds checking is not the same as integer overflow checking. The conventional wisdom is that array bounds checking rarely leads to unacceptable overhead and is a good default, while software overflow checking is completely infeasible. I don't have any data of my own, so I won't make a claim. But honestly a couple of orders of magnitude would not surprise me. CPUs do an enormous amount of integer arithmetic.

Respond to Jules

You seemed to have overflown yourself here. This is a response to Jules' post above, not mine. I didn't discuss overflowing.

No, I was responding to you.

No, I was responding to you. You were the first one to mention array bounds checking, two comments above. Everybody else was talking about integer overflow checking. It seemed to me that you changed the subject.

The Stackoverflow post

The Stack Overflow post attributes the 500x performance degradation to array bounds checks, though they were discussing integer overflow checks also/before.

It seems to have led to some confusion.

I see. I'm sorry to have

I see. I'm sorry to have contributed to the confusion!

Same deal somewhere though array bounds checking is worse

I expect both features to be implemented with a test, or comparison, a branch, and a jump to a subroutine if stuff goes wrong.

So the overflow check is less costly since I imagine that's a check on a register whether a bit is set, whereas the array indexing check is a load from memory and a compare instruction. The first messes with branch prediction whereas the second also messes with caching behavior.
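Written out by hand, the bounds check being described looks roughly like this (hypothetical C; a real compiler emits the equivalent inline):

```c
#include <stdio.h>
#include <stdlib.h>

/* What an element access `x[i]` becomes with a bounds check inserted:
 * a load of the stored length, a compare, and a branch to an
 * out-of-line failure routine before the actual element load. */
int checked_read(const int *x, size_t len, size_t i) {
    if (i >= len) {                      /* compare + branch */
        fprintf(stderr, "index out of range\n");
        abort();                         /* jump to failure handler */
    }
    return x[i];                         /* the load we wanted */
}
```

Fetching `len` is the extra memory access mentioned above; if it lives next to the array data it is usually already in cache, which is one reason bounds checking tends to be cheaper in practice than the worst case suggests.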

So, in the examples shown, you find exactly that behavior, but the examples are also written as worst-case scenarios.

Swift looks good from the perspective that you can get close to C, but it also looks like you cannot get rid of the slowdown (unless they did something really stupid).

So we learned the same lesson again: Overflow and array bounds checks are costly in certain scenarios.

Triviality for a C compiler writer

It's unlikely that a team stemming from LLVM doesn't know about the trivialities of generating fast optimized code. Programmers are upset now, but I doubt the team is.

They opted to put these features in but I doubt the team wasn't aware of the potential cost. They may have been surprised by the actual cost of these checks though.


But honestly a couple of orders of magnitude would not surprise me.

It wouldn't surprise you if turning on overflow checks increased the overall run-time by two orders of magnitude? I would expect a non-naive implementation to be in the 2%-20% increase range.


I didn't say "non-naive." I had in mind the most naive possible strategy that literally checks and branches around every single integer operation. I could imagine that degrading performance very, very badly.

Proven track record

As Sean mentioned below, Apple is the principal on LLVM. Moreover, they were the ones who took GCC's PPC performance up to the point that game consoles could use the PPC CPU. I'd be pretty confident in their abilities. Further, a lot of the craziness in Swift comes from starting with Obj-C / Cocoa and working backwards. Swift is designed for mid-level performance (on their platform) from the ground up.

Even better, Chris Lattner IS

Even better, Chris Lattner IS the LLVM guy, so he should know something about optimization. There is probably a larger team behind this, but technical leadership is also important. I don't think Apple will mess this up; they have the execution focus to make Swift a good programming experience, just like iPhones, iPads, and MacBooks are good end-user computing experiences.

Chris Lattner?

Do you mean?

Ya, corrected. I knew it was

Ya, corrected. I knew it was a "Chris"; it will take me a while to get used to the name :)

Initial problem with the language

Like C, Swift uses variables to store and refer to values by an identifying name. Swift also makes extensive use of variables whose values cannot be changed. These are known as constants, and are much more powerful than constants in C. Constants are used throughout Swift to make code safer and clearer in intent when you work with values that do not need to change.

When they start conflating variables (associated value can be changed) with constants (associated value cannot be changed), then I have doubts about any part of the language. I am already averse to reading any further. I think I would rather stick with a language like COBOL or FORTRAN (both of which I haven't programmed in for many, many years) for application development.

Wrong thread. There's a

Wrong thread.
There's a thread for discussing the language. This one is for the politics.

My comment was about the politics

I didn't think I was being that obtuse with my comment.

Let me put it another way. A company as large as this should have the resources to have someone (who has a clue about language design and implementation) vet the documentation of a new language that they are trying to get developers interested in. Making what is, to me, a fairly substantial mess-up in the quoted statement on the introductory page of their documentation does not bode well for the future of the product.

Companies that go down this path are essentially trying to achieve vendor lock-in. This is no different from Oracle and the Java API legal shenanigans, or Microsoft and C# and the potential for interfering with Mono under Linux.

I am sorry to say that when simple things are screwed up like this, then the most likely explanation is that the marketing or legal groups within the organisations have got involved somehow.

My ultimate point, I suppose, is that leave bad alone. I haven't bothered to investigate the merits of the language, its available features, it uses or anything else about the language specifics. All of my distaste came about from simply reading the quoted comment on their introduction page.

Over many years, I have had to deal with many vendor specific languages, some have been good, some bad. The bad have been characterised by a lack of clarity, particularly in their documentation. Or in the case of one particular vendor, the removal of usability for the addition of "pretty".

Fair enough, I understand

Fair enough, I understand now what you had in mind.

Variables vs variables

Be careful there, you've fallen for a common confusion in the world of imperative programming. There, and only there, do many (but not all) people equate 'variable' with 'mutable'. In mathematics, a variable always has been a place-holder name, not a mutable entity, and so it is the case in other programming communities, e.g. functional or logic programming, and even in parts of the imperative world.

So you should probably reconsider brushing off the entire language just because you're unfamiliar with the author's use of (common) terminology.


That's an interesting point. It's a bit unfair, since obviously the fact that variables range over something is right there in the name. Why call it a variable if it's not, you know, variable? But in mathematics, I suppose a variable is usually actually more like a kind of meta-variable, ranging over possible values but in any particular case taking only a single one.

It's not so common in mathematics to use variables that implicitly vary over time, and in domains where it is common, such as physics, people commonly complain about confusing notation (e.g., using 'v' rather 'v(t)' for velocity, even when it really is a function of time).

Anyway, like I said, it's an interesting point.

So variables in math are not

So variables in math are not variable, but constant?


In math the terms of a formula which are meant to be (essentially) beta-reduced, conceptually or actually, are sometimes called "variables" because they stand for an indefinite value. Even though they are called "variable" the value they denote does not vary within the scope of their meaning.

I haven't looked at Swift's details seriously. Is it the case that the value of a Swift (constant) variable can be undetermined until run-time and also unvarying within the dynamic scope of its binding?

math variables

They're variable.

Math does use symbolic constants, like tau (circumference of unit circle) or c (speed of light in a vacuum) or k (Boltzmann's constant) or e (Euler's number, related to natural logarithm).

The error was in the transition to imperative programming, which unfortunately conflates variables with mutables. That's another billion dollar mistake, IMO.

C (math variables)

C has const variables and people talk about them that way.

const variables

The 'const' keyword is useful, and I make heavy use of it in C and C++ code. When was it introduced to C? Was it C99? I know it was backported from C++, and that some ad-hoc conventions existed for it before then.

But adding const vs. mutable properties to variables is still coupling too many concepts, IMO. I'd argue the only languages that do 'mutables' right are those that cleanly distinguish mutable structures as a special kind of allocation orthogonal to variables and structured values, e.g. Haskell's `newIORef`.

const is old school too

I'm pretty sure C code like this was idiomatic in my 1983 copy of K&R:

static const char* my_names[] = {
    /* ... */
};

Some runtimes advertised intent to put such constants in read-only memory, so you'd SEGV trying to write them as lvalues. But absent such protection, const is an advisory status observed cooperatively, with compiler static checks catching any failure to cast away const with a suitable cast expression.

Along with proper use of namespace and code organization conventions, const is a way to inform other developers which parts of code might be able to influence a computation you want to understand. The fewer places that can affect something, the easier it is to understand, because only those places need be studied to grasp original developer intent. Epistemologically speaking, smaller scope is better style.

It was ANSI C, via early C++

Const was added by "ANSI/ISO C" (a.k.a. C89/90), and accordingly, only appeared in K&R's 2nd edition (1988). It was available in some C compilers before that, though, who imported it from early C++. I believe C++ introduced const as early as 1983.

Const has always been an unfortunate misnomer in C/C++. It does not mean constant at all -- it means read-only. The difference being that e.g. you cannot write to an address through a const pointer, but it can still be mutated legally via other alias references.
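A small C example makes the difference concrete (`const` promises only that this particular reference won't be used to write):

```c
/* `const` on the pointer makes it a read-only view, not a guarantee
 * that the target is immutable: the same object can still change
 * through another, non-const alias. */
int observe(const int *view) { return *view; }

int demo(void) {
    int x = 1;
    const int *ro = &x;    /* read-only view of x */
    int before = observe(ro);
    x = 2;                 /* legal mutation through the original name */
    int after = observe(ro);
    return after - before; /* 1: the "const" view saw the change */
}
```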

Or via a cast

There are languages IIRC that have both "readonly" and "immutable" references. Only immutable objects can be passed to immutable references, and immutable means "nobody can change it, ever". Conversely, both mutable and immutable objects can be passed to readonly references, which means "it can't be changed through this reference, but someone else might hold a mutable reference to it".

BitC tried this and stumbled

Scott: We wrestled with this distinction between an immutable object and a read-only view in BitC and stumbled badly. If you know of a language that has managed to pull it off, I'd really appreciate a pointer to any paper or writeup that may have emerged.

Rust does this (n/t)

I said n/t.


I'd say SQL would be the best example. The language has the concept of views, which are quite often readonly and do quite complex things. The underlying objects are mutable.

A read-only view is then the return value of an expression which just reads, while an immutable object is a table (or table space) which cannot be written to.

re sql

I think it is more natural to say that, in sql, all query results whether from views or tables, are immutable values. Views are references to (usually mutable) queries. Used in a query, a view produces an immutable value in an unremarkable way. The query "underlying" a view may reference tables which can not be updated using the view itself -- but this is not the same thing as saying a view is a read-only reference to those tables.


You are forgetting the whole permissions system and the splits. Views (particularly materialized views) are often running on different servers than the ones with the mutable data; they just get the updates. Moreover, the people with view access don't have access to change the structures. IMHO going to the underlying query and saying that's mutable is like saying an immutable variable is mutable because the value is being stored in mutable RAM.


the people with view access

I have no idea what that could mean in this context. I thought we were talking about programs.

In any event a client may or may not have permission to perform a certain mutation but this does not mean that SQL contains two kinds of reference to a view or that a view is a constant reference to any tables.


GRANT privileges ON object TO user;
REVOKE privileges ON object FROM user;

is part of the language.

re permissions

I don't see any way that supports your original claim and you haven't tried to explain. The relevant part of the thread starts with the comment titled "Or via a cast".

Permissions and SQL

It is pretty simple. SQL has a clear, effective, heavily used and useful distinction between immutable (for an individual user / program) and readonly. It serves as an example of precisely what was being asked for.

not "precisely"

I think you'll find that Scott Johnson was talking about two kinds of references to a common object where "object" is talking about something like an entity within the memory model of a single process. The question is about how to have these two kinds of references at the language level. Shap is talking about (unspecified; perhaps typing related?) difficulties trying to do this reasonably in BitC.

It's a stretch to begin with to try to say that there are even "references" in that same sense in SQL or to assume too much of a simple relation between database transactions and, say, the memory model in BitC. Moreover you've wandered in your point from saying something about "views" in SQL to now saying something about "clients" and their differing permissions.

But, perhaps I misunderstand you. Everyone can make mistakes and it isn't always easy to recognize or acknowledge one's own, no?

We wrestled with this distinction between an immutable object and a read-only view in BitC and stumbled badly. If you know of a language that has managed to pull it off, I'd really appreciate a pointer to any paper or writeup that may have emerged.

Is there an SQL paper or writeup you'd recommend in this context?

D2 has const and immutable

They are both transitive: all objects reachable through a const reference are read only (see the D documentation on type constructors).

Modifying an immutable is considered undefined behavior.

Template code has const and immutable attributes inferred, allowing for single implementations of functions that work on mutable, const, and immutable data.

log of stumbling?

what went wrong? details would be nice to hear, to learn about this garden-path-paved-with-good-intentions, and to see if it sparks any ideas in other folks here.

I assume you're correct

That I misremember seems likely then. If I run across my tattered 1983 edition in a box soon, I'll follow up, but I may no longer have it. Perhaps the addition of keyword void from C++ at that time hogged my attention at the expense of the more obvious const. Edit: My online searches reveal nothing about the history of the const keyword, but it gets mentioned in the same breath as other C89 changes along with void. Now I'm starting to recall an idea that const was already in use, but in fewer places, and had new uses added to match C++ in some ways.

they vary until bound

So variables in math are not variable, but constant?

(Not an expert.) Variables are names bound to different things depending on context. In math the distinction is usually between free and bound variables (wikipedia: Free_variables_and_bound_variables).

Imperative languages bind to a storage location that's mutable unless declared immutable (e.g. using const in C). Functional languages aim to bind to the value inside, so storage location is an implementation detail, presumed not to matter as long as never mutated when still reachable from a past binding.

Variable vs Constant

Thanks for all the comments. The political commentary continues.

For those who argue against my choice of phrasing, let's look at the following definitions:

1. apt or liable to vary or change; changeable: variable weather; variable moods.
2. capable of being varied or changed; alterable: a variable time limit for completion of a book.
3. inconstant; fickle: a variable lover.
4. having much variation or diversity.
5. Biology. deviating from the usual type, as a species or a specific character.
6. Astronomy . (of a star) changing in brightness.
7. Meteorology . (of wind) tending to change in direction.
8. Mathematics . having the nature or characteristics of a variable.

9. something that may or does vary; a variable feature or factor.
10. Mathematics, Computers.
a. a quantity or function that may assume any given value or set of values.
b. a symbol that represents this.
11. Logic. (in the functional calculus) a symbol for an unspecified member of a class of things or statements. Compare bound variable, free variable.
12. Astronomy, variable star.
13. Meteorology.
a. a shifting wind, especially as distinguished from a trade wind.
b. variables, doldrums ( def 2a ).

1. not changing or varying; uniform; regular; invariable: All conditions during the three experiments were constant.
2. continuing without pause or letup; unceasing: constant noise.
3. regularly recurrent; continual; persistent: He found it impossible to work with constant interruption.
4. faithful; unswerving in love, devotion, etc.: a constant lover.
5. steadfast; firm in mind or purpose; resolute.
6. Obsolete . certain; confident.

7. something that does not or cannot change or vary.
8. Physics. a number expressing a property, quantity, or relation that remains unchanged under specified conditions.
9. Mathematics . a quantity assumed to be unchanged throughout a given discussion.

So irrespective of one's viewpoint, variable has the sense of changeability and constant has the sense of fixity.

When language designers then conflate the two concepts as if one is a special case of the other then one must question what they are saying and what they are thinking. Over the decades, I have seen quite intelligent (if not brilliant) people creating new languages and still messing up such concepts.

As I sit here, I am reminded of two things. The first is KISS and the second is BBB from a science fiction story by Eric Frank Russell called "Next of Kin". Too often, I find the papers written on language design, etc, ignore the first and follow the second. There are a number of authors who manage to do the first and avoid the second. Papers of this quality are a pleasure to read and even if the concepts are difficult, well worth the effort.

I have a grandson who is not yet 5 and has his own little electrical workbench, loves electric motors, and in his own simple way can explain what is going on with them. His little sister (just turned 3) has quite a grasp on technical matters as well. The point: if you say something that is not quite logical, they pick you up on it and say you are being silly. Getting back to my original point, the fact that such documentation was let loose on the world by an organisation that should have the resources to vet such things is a fairly substantial mess-up, and suggests other things are in play that do not bode well.

Andreas, getting back to your comment, I am more than happy to brush off the entire language, since your own usage of variable still indicates changeability, not fixity. If the terminology has got to the point of conflating what are conceptually different things, variables and constants, with one being a subset of the other, then no wonder much of the industry is right royally messed up. In that regard, I'm glad I am now able to program and study at my pleasure and no longer have to deal with it as my working industry.

Hence, constants always represent the same thing, whereas variables are used to represent something that can change (irrespective of whether it is just a place holder or represents a mutable entity within the relevant context).

Since functions can be

Since functions can be called multiple times, there is variability in how their arguments are bound on each call. I'm sure the mathematical notion of variability is related to something like that.

I don't know. I think

I don't know. I think informal mathematical language supports Bruce's position. For example, I think it would be odd to hear a mathematician refer to pi as a variable. It is a constant, not a variable. The problem with Bruce's preferred terminology is that he'd have to introduce a new name for 'symbol representing a value', which is really the more fundamental concept and what we take 'variable' to mean, or forever repeat 'variable or constant'.

The point is that pi's value

The point is that pi's value is fixed in our reality, while x, y, and t are not. In f(x), x is variable, and we often graph f(x) against a range of values for x.

Variability simply means what symbols have fixed values (pi) and what do not. So take an arbitrary ID in your code, and if you can ascribe one interpretation to that ID (say PI, a non-variable type, or a direct procedure name with one implementation), it is a constant...otherwise it is technically variable.

I have no problem with

I have no problem with symbols representing different things. What I have a problem with is conflating conceptually different things as being the same kind of things, in this case variables and constants.

If a concept is "orthogonal" to another concept then let's keep them separate. If there is a cross over, let's try to find what are the real differences and deal with those. As in above, the concept of associating a symbol with something is separate from the something that we are dealing with. I hope that is clear, it's late and I'm off to the land of nod.

Beware the general case

Note that constant bindings generally are variable, too:

function f(x) {
  const y = x + 1
  // ...
}

In what sense is y a "constant"? The problem with the distinction you are trying to make is that you are only considering a special case. In general, the differences are far less clear.

In any case, you are up against a century of CS tradition. The reason that constant bindings are not usually considered a separate thing is that they are simply a degenerate case of the general concept. And for the semantics aficionados, any bound identifier is just a short-hand for a lambda parameter anyway -- recall the name of this blog. ;)
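The same point in C (a sketch): `y` can never be reassigned within a call, yet it denotes a different value on each call, so in the general case even "constant" bindings vary.

```c
/* A "constant" binding that nevertheless varies: y is bound exactly
 * once per call and cannot be reassigned, but each call binds it to a
 * different value. */
int f(int x) {
    const int y = x + 1;
    return y;
}
```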

Presumably y wouldn't be

Presumably y wouldn't be constant at all under Bruce's definition. In C++ lingo, I think he's talking about the constexpr specifier rather than the const specifier.

Here is where I would argue

Here is where I would argue that we have a situation where we have partially merged two concepts. In this case we have a variable y to which we have added the additional restriction that its value cannot be updated after it is first assigned. The language has a single keyword indicating no update, and it may be used both to define an association between a symbol and a constant and to define an association between a symbol and a "variable" that we are given no ability to later update.

There is conceptually different semantics involved but you are using the same syntactic form to represent both.

You make my point though in that the concept of binding a symbol to a value is separate to whether or not that value represents something that is permanently fixed (a constant) or can change (a variable).

Semantics are implied or explicit.

I agree it is confusing. In functional languages we have symbols that are one-time-bound to values (constant) and references (mutable). In C, symbols can be bound to values and rebound, and the type of the symbol is statically determined, whereas in functional and logic languages the type is dynamic. The semantics of variables is obviously dependent on the language, so it should either be obvious from the context or stated explicitly.

It seems a similar situation to types, where the arrow in a function type has different semantics depending on the language, pure in Haskell, with side effects in ML.

So for a new language where we can't imply the semantics yet, they need to be explicitly stated.

Exactly, but...

You make my point though in that the concept of binding a symbol to a value is separate to whether or not that value represents something that is permanently fixed (a constant) or can change (a variable).

Yes, exactly, and in PL theory circles this distinction is standard. However, a name introduced by a (any) binding is consistently called a variable (and being subject to standard meta-operations like substitution), and an object that is mutable is called a reference cell, or assignable, or other names. Variables can be bound to reference objects, but don't have to.

As has already been hinted at by others in this thread, some (mostly modern) programming languages also make this distinction, notably the ones in the ML tradition. There, bindings just name values, while reference values are their own type. In fact, Algol 68 already made this distinction.


The nice treatment of variables in ML simplifies both programming and reasoning about programs. François Pottier remarked that the mutable variables used in the common presentations of (concurrent) separation logic are an extra source of complexity that goes away if we go back to the ML tradition. We should kill the mutable variables.

(In practice, adoption-oriented languages such as Swift or Scala rather go in the opposite direction of having a class of mutable variables, the only difference with a restricted use of references being that there is no explicit mark on variable access. Interestingly, the Swift syntax to access option types, foo!, is very close to the ML notation to access a reference, !foo. In a few generations, programmers may come to think that being explicit about mutable access is a decent default choice.)

Implicit mutability useful

These transformations from mutable variable use to immutable variable can lose information. Say I have a mutable variable X and use SSA to transform it into XA and XB. If I run a dependency tracker, I lose the fact that XA and XB were the same X before, and am now tracking separate XA and XB variables! That I was reading and writing just X is semantically useful information, and we've lost that by using immutable variables. There is more to life than basic code execution!
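The transformation being described can be sketched concretely. Here is a minimal Python illustration (function and variable names are invented for this sketch) of a mutable x split into SSA-style immutable bindings:

```python
# Hypothetical sketch (names invented): a mutable variable x rewritten
# into SSA-style immutable bindings.

def with_mutation(n):
    x = n
    x = x + 1      # step A: update in place
    x = x * 2      # step B: update again
    return x

def with_ssa(n):
    x_a = n + 1    # step A gets its own immutable binding
    x_b = x_a * 2  # step B likewise
    return x_b

assert with_mutation(3) == with_ssa(3) == 8
```

A dependency tracker operating on the second form sees two unrelated bindings; the fact that both were "the same x" survives only in the naming convention.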

There are cases where a mutable local variable is useful and meaningful; then there are the haphazard cases of using a variable v for a multitude of unrelated things, but I don't think the latter is very common.

There is more to life than basic code execution!

No there isn't.

Right, I get that functional

Right, I get that functional programmers hate debuggers.

Who? What? Where?

I am not a functional programmer. Just making a joke. People who get that there's more to life than code wouldn't frequent LtU a lot, I imagine.

Fair enough. I should have

Fair enough. I should have said "we do more with code than just basic execution" or something like that. Functional PL is supposed to make life easier for everyone, but often makes things much much harder.

code transform

Whenever we transform code (e.g. compilation, inlining, optimization), we can lose useful information (e.g. types, location, comments). But that's just a concern for transformations in general. I don't see how an argument against transformation serves as an argument in favor of implicit mutability.

As far as debugging goes, even in imperative languages I find it much easier to debug if I write code in an SSA style. Why would I ever need mutable X when I can have XA and XB for different steps A and B?

It helps when you are trying

It helps when you are trying to reason about how X changes. Without an explicit notion of X, then you have to do that processing all in your head.

reasoning about change

There are, of course, many cases where reasoning about state and change is essential. But your specific example - i.e. where an SSA transform from mutable X to immutable XA,XB is feasible - does not seem to be such a case. Rather, trying to reason about how X changes is accidental complexity due to unnecessarily expressing the program in terms of mutation.

If I can avoid reasoning about change, that's easier on my head. If mutability is explicit, I can avoid reasoning about change except for a few explicit elements.

Say you are writing a parser

Say you are writing a parser by hand (it happens a lot in industry). You have a cursor variable for a parse tree method; it's an argument and an output. The cursor itself is an immutable value, allowing us to use it safely to identify landmarks (e.g. other parse trees). Whenever a tree is parsed recursively, it takes a cursor and returns a new one.

Now, let's just assume no loops to make it simple. Would you prefer to manually convert cursor so that we have cursor0, cursor1, cursor2...cursorN for however complex our parse method is? The code not only becomes very obfuscated, but is difficult to refactor and debug. Oh, and what if I accidentally use cursor4 when I really should have used cursor5? Oops, that's a bug.

My old parser actually "returned" new cursors rather than using C#'s ref keyword to just update it in place, but this was very error-prone, because I would forget to reassign the new value and whatnot. Perhaps linear types would help prevent errors, but barring that... ya, FP is quite difficult without using at least some kind of monadic style.
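The error-prone "return a new cursor" style described here can be sketched as follows (a Python illustration with invented names; the parser discussed was C#). Each parse step takes a cursor (an index here) and returns (value, new_cursor):

```python
# Hypothetical sketch: each parse step takes a cursor and returns
# (value, new_cursor). Forgetting to rebind the cursor from a result
# is exactly the bug described above.

def parse_digits(src, cursor):
    start = cursor
    while cursor < len(src) and src[cursor].isdigit():
        cursor += 1
    return int(src[start:cursor]), cursor

def parse_pair(src, cursor):
    a, cursor = parse_digits(src, cursor)      # must remember to rebind
    assert src[cursor] == ','
    b, cursor = parse_digits(src, cursor + 1)  # ...and again here
    return (a, b), cursor

value, end = parse_pair("12,34", 0)
assert value == (12, 34) and end == 5
```

Every call site must remember to rebind the cursor from the result; using a stale cursor raises no error and silently misparses.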


Use cursor_aa7fa526_b02a_4253_b1e1_cb2e670a4b51, cursord_2357b02_61f1_45b6_b6d5_5c75025632b2, etc. instead of cursor1, cursor2, etc. That way when you add a new cursor in between two existing cursors, you only have to change references to one cursor.

Uhm...no :)

Uhm...no :)

re GUID identifiers and Uhm ...no :)

Well, with a suitably sophisticated IDE this should be no problem.

(This is a political topic post and sarcasm is allowed, right?)

re: cursor example

If we have a lot of repetitive code that we're likely to get wrong, we should try to abstract the repetitive elements. In this case, we might abstract glue code by using a parser combinator or state monad.

This example does not seem to involve any "reasoning about change". The main proposed benefit seems to be removing names or values from scope, i.e. such that you cannot use `cursor4` twice (by accident). It seems to me there are many other ways to approach that particular concern.

I'm not particularly fond of named local variables when I can avoid them. But if I did use them, I'd probably use names like `cursorAfterPhoneNumber` instead of `cursor4`. Among other things, such names are much more likely to be stable across changes in schema, and also easy to debug, refine, and refactor (no pressure to renumber or rename).
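As a hedged sketch of the "abstract the glue" suggestion: a minimal state-monad-like sequencer (Python, all names invented) where each parser is a function cursor -> (value, cursor) and the cursor threading lives in exactly one place, so client code can no longer grab a stale cursor by accident:

```python
# Tiny combinator sketch (invented names): 'seq' threads the cursor
# through a sequence of parsers, centralizing the glue code.

def seq(*parsers):
    def run(src, cursor):
        values = []
        for p in parsers:
            value, cursor = p(src, cursor)   # threading lives here only
            values.append(value)
        return values, cursor
    return run

def digits(src, cursor):
    start = cursor
    while cursor < len(src) and src[cursor].isdigit():
        cursor += 1
    return int(src[start:cursor]), cursor

def literal(ch):
    def p(src, cursor):
        assert src[cursor] == ch
        return ch, cursor + 1
    return p

pair = seq(digits, literal(','), digits)
values, _ = pair("12,34", 0)
assert values == [12, ',', 34]
```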

If we have a lot of

If we have a lot of repetitive code that we're likely to get wrong, we should try to abstract the repetitive elements.

Should we? Although I'm inclined to think powerful abstraction facilities are desirable, this is because I don't want the programming language to limit my choice of abstraction — not because I "never met an abstraction I didn't like". On the contrary, I've met quite a lot of abstractions I didn't like, and often felt it was the best that could be done in that programming language but the language should have been able to do better.

Verily, the elementary case in favor of arbitrary abstractions is to reduce the chance of making mistakes when doing the same thing again and again... but surely there are drawbacks as well. How about greater difficulty of understanding, together with greater difficulty of dealing with things that don't quite fit the pattern? Which amounts to: harder to use and less flexible. Presumably it'd be easier to mitigate these problems in a language with more versatile facilities for building abstractions (i.e., "more abstractively powerful"); but it doesn't follow that all the things thus facilitated are specifically desirable.

awkward abstractions

As you mention, languages often limit abstraction. But awkward abstractions in languages often become DSLs, scripts, or frameworks. I believe that a balancing point for repetition between 'should abstract' and 'shouldn't bother' can vary a fair amount based on the language and context, and how easy, natural, and smoothly integrated the necessary abstraction would be.

We would do well to develop languages or paradigms more effective at abstracting & integrating frameworks.

Regarding 'understanding' - one must be careful to weigh the overheads for comprehending a new abstraction vs. the aggregate overheads and semantic noise from repeating a poorly abstracted structure and trying to see the important differences. Either can contribute to misunderstanding. I believe this rolls into the aforementioned balancing act.


Why do this, what is wrong with symbol rebinding? References are a problem due to potential side-effects, but let the compiler do the loop unrolling. I would also just plain return the cursors.

The cursor object can be immutable, but you can allow rebinding, so that you cannot change any properties on the cursor, but you can swap cursors.

f(cursor x) {
   do {
       result = parse(x)
       x = result.x
   } while (result.success)
   return x
}

With move semantics, returning local objects from functions is very efficient, and avoids new/delete, references, etc.

re symbol rebinding

Rebinding is nicer than mutation in contexts involving lexical closures, side effects, parallelism, or lazy evaluation. But it can still be relatively difficult to reason about, e.g. in cases involving conditional rebinding within a loop.

To the extent we use it at all, I think there is value in favoring tail calls or loop variables (e.g. `foreach foo in fooList`) to express rebinding in a structured manner. Equational reasoning is easier with a clear lexical scope for each binding.
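A small illustration of the difference (Python, invented names): the same fold written with ad-hoc rebinding versus a structured loop variable whose rebinding is confined to the loop header:

```python
# Sketch: ad-hoc rebinding vs. structured rebinding. In the second form
# each iteration's 'item' has a clear lexical scope, which is what makes
# equational reasoning easier.

def total_adhoc(items):
    acc = 0
    i = 0
    while i < len(items):       # 'acc' and 'i' are rebound freely
        acc = acc + items[i]
        i = i + 1
    return acc

def total_structured(items):
    acc = 0
    for item in items:          # rebinding confined to the loop header
        acc += item
    return acc

assert total_adhoc([1, 2, 3]) == total_structured([1, 2, 3]) == 6
```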

Foreach and Algorithms

Yes, 'foreach' is a nice way to do rebinding; it's clearly structural, and will terminate with data (but not co-data).

If reasoning about code is easier, why are algorithms written using symbol rebinding? If I pick up any book on algorithms, for example Tarjan's little book on network algorithms, they are all expressed in a style using value semantics and symbol rebinding.

network effects

If reasoning about code is easier, why are algorithms written using symbol rebinding?

Context and history shape our actions. Algorithm books are written in context of PLs and tools familiar to authors and their audiences. People can push an alternative notation, but it is difficult to teach or learn two new things (both algorithm and notation) at the same time. Authors and audiences are understandably reluctant to try.

Most questions of the form "If foo is better, why isn't it more popular?" - in ANY domain - are remarkably naive to network effects, path dependence, and various social forces. Quality is a minor factor in success.

Natural Understanding

Is it possible that it is easier to reason about and construct algorithms using symbol rebinding? It certainly leads to more compact notation, and it may be easier to intuitively grasp the meaning. One should consider that the simpler and more natural notation is the one humans first thought of, and as humans are the primary consumers of these things, perhaps the more natural notation is somehow easier. The first algorithms were written before PLs were even (officially) thought of.

I like Stepanov's view on this, it is mathematics that has to improve to cope with memory, not programming that has to give it up (apologies for the bad paraphrasing).

natural misunderstanding

To clarify: I don't consider all forms of 'thinking' to be 'reasoning'. While some forms of thinking may seem easier or more 'natural', many are biased, fallacious, imprecise, inaccurate, or otherwise ineffective. Mathematics and logics provide a basis for effective thinking, i.e. 'reasoning'. And equational reasoning is among the easier approaches of reasoning for humans to grasp.

Programming today is consistently buggy and overbudget. Even algorithms found in books are often flawed. I disagree with your opinion that it's mathematics, not programming, that should change. But I would agree with a weaker statement: that it would be nice if mathematics can find effective expression using metaphors we humans find more natural or intuitive. (e.g. I'm experimenting with leveraging models of hands and zippers to model a game-like programming environment, related to math in terms of category theory. There seems to be a useful similarity between editing gestures and groupoids.)

Whose Opinion?

It wasn't my opinion, but that of Alex Stepanov, who is a mathematician and a programmer and so is probably qualified to make such a comment. I tend to agree with him.

Knuth's work is written using assignments too, and he is also a mathematician.

It would seem the mathematicians have less problems with rebinding symbols than the computer people.

Knuth latest remark

It was something about how all geeks know that at some point you'll need to add one to a counter.

I take that to mean he has assumed a position in the latest debates about where to go with PLs.

Somewhere, I thought it was also due to a debate sprung from the latest remarks of Barbara Liskov on PLs. But I guess that's connecting dots which aren't there.

Absolutely not. People

Absolutely not. People actually do think in terms of time and change, and symbol rebinding is just an extension of that. To get people to think timelessly (i.e. more mathematically) is actually quite difficult, and not as natural to us.

So ya, if 100 thousand years of evolution vs. 2000 years of math should be considered a "network effect" then maybe you are right.

People think in terms of

People think in terms of time and change, I agree. But they aren't very good at it, not even at the small scale of estimating where two trains will pass each other or dodging a punch, much less the medium scale of a block puzzle or larger scales. I think one could present a good argument that humans are better adapted to static and timeless thinking.

Beyond that, treating symbol rebinding as sufficiently analogous to the motions and changes people grok seems a stretch.

Besides, those 2000 years of math have seen the greatest advances in human technology. This suggests that reasoning timelessly is much easier for us. (Again, reasoning isn't merely thinking or reacting or intuiting. It's thinking effectively - cogent forms of thought and argument.) As I mention above, I distinguish thinking from reasoning. Easier thinking is only useful insofar as it's effective. Those 100 thousand years of prior history aren't exactly a monument to effectiveness of natural human thinking.

Humans aren't adapted to

Humans aren't adapted to static/timeless thinking at all. That we are capable of it through lots of training is due more to our flexibility (an accident that was not subject to natural selection) than to some innate ability that was useful for hunting mastodons. That programming is the way it is today is really no historical accident at all.

Symbol rebinding is "change" to the person even if it isn't really change to the computer (via SSA transformation). Take away this ability and lots of code becomes harder to write, or has to be completely re-architected to become elegant again (e.g. using parser combinators rather than recursive descent).

Hunting mastodons and theorems

I don't think we know nearly enough, yet, about the strengths of stateful thinking to be able to make an informed judgement on its merits and demerits. Indeed, what we know about the merits and demerits of different ways of thinking may be more random than what we've evolved to be good at; we may have evolved to think a certain way because that approach has some advantages that our meta-reasoning, still in its infancy, hasn't caught up with. My intuition tells me there are deeper things to say in favor of stateful thinking than what we grok atm; and until we know what those things are, we won't be able to make an informed judgement, one way or another. Stateful thinking is apparently good for lots of things we needed to do during our evolution; stateless thinking is apparently good for applying our current formal tools to various kinds of reasoning. The latter may be more coincidental than the former (or not; but I don't think we're ready to make that call).

loop invariant

Humans aren't adapted to static/timeless thinking at all.

I find this assertion utterly ludicrous. The vast majority of human experience and thinking involves places and things that don't move about on us, or conditions and behavior that are predictable primarily in their similarity from day to day or year to year.

It is my observation that most human thinking favors static systems, and further that humans are relatively uncomfortable with significant changes of any sort. But I'll grant that some forms of 'timeless' thinking are more of the 'loop invariant' nature - e.g. day/night, work schedules, habits, traditions, seasons, holidays and weekends - and thus may not seem timeless unless you step back a little.

And other forms of timeless thinking do involve moving parts. E.g. we like to think of roads and plumbing in 'timeless' ways, even though they serve as platforms for flowing elements. When this breaks down - e.g. with traffic jams - we have a difficult time understanding why... we're better at understanding the static parts and stable behaviors than the dynamic ones.

not some innate ability that was useful for hunting mastodons

Hunting effectively also takes a lot of training or practice. It isn't some natural ability we're born into. And hitting a moving target with a thrown spear involves only a few 'moving' things - hardly analogous to the scale of a modern software application.

Take away this ability [to change] and lots of code becomes harder to write

I do not suggest taking away the ability to express programs in terms of change. But change does easily hinder reasoning. I would suggest making it easy to control, e.g. by avoiding it as an implicit or global feature, favoring determinism and repeatability, supporting confinement or isolation.

I find this assertion

I find this assertion utterly ludicrous. The vast majority of human experience and thinking involves places and things that don't move about on us, or conditions and behavior that are predictable primarily in their similarity from day to day or year to year.

That is not stateless, however, or even formal. We've been able to replicate this behavior in the lab relatively easily (with machine learning), and it didn't involve time-free thinking on the computer's part. Rather, it is our ability to abstract over multiple events that is key, rather than existing in static systems, which we don't.

Hunting effectively also takes a lot of training or practice. It isn't some natural ability we're born into.

Are you sure about that? Obviously, most of us don't do it anymore, but back when we had to, it didn't require "a lot of training or practice."

favoring determinism and repeatability, supporting confinement or isolation.

Of course, but we can do that without taking away convenient local variable reassignment.

Regardless of evolving in a

Regardless of evolving in a stateful system, we humans tend to think about things as stable or static. We have difficulty reasoning about more than one thing changing at a time, much less emergent behavior.

I've not claimed we exist in static systems or that we are unable to abstract over events or motion. Rather, I've claimed we aren't very effective in our thinking about dynamic systems, that we are much more effective in thinking about stable and invariant behaviors.

This is perhaps because, in a static system, there is simply less to think about. But it seems ridiculous to deny that most things in our experience really are stable, and we think about even changing things (like human faces) as stable, and that this is often good enough.

As a trivial example of how even dynamic systems are both mostly static and framed statically by humans: your toothbrush doesn't move on its own, and it tends to have a stable shape from day to day by which you can recognize it, and most likely you put it in the same approximate place every time you're done with it, and you'd be upset if other people moved it on you.

Are you sure about that? Obviously, most of us don't do it anymore, but back when we had to, it didn't require "a lot of training or practice."

Training and practice are a big part of the cultures where hunting was important. But yes, I am sure we require training and practice to hunt effectively. Study some historical cultures, esp. regarding how children are raised, if you don't believe me.

we can do that without taking away convenient local variable reassignment

Depends on what you mean by 'convenient' and 'local'. In any case, such constraints only mitigate; it's still more difficult to reason about a program when symbols have different values or behaviors from step to step.

It isn't as though I'm completely against expressing programs with change. Stack shuffling functions, even in a purely functional concatenative language, have similar issues for reasoning vs. convenience or intuition (which is not the same as reasoning).

Referential Transparency

Aren't referential transparency and call-by-value much more important for reasoning about programs than symbol rebinding? In other words, we can allow symbol rebinding and keep the other two; isn't that a good compromise?

referential transparency

Referential transparency assumes a referential context - e.g. a syntactic scope in which a fragment of code that references its environment has a stable meaning. This context allows us to reorganize expressions, insert or remove statements, and otherwise manipulate subprograms without altering their meaning. There is a tradeoff between how flexible is your symbol rebinding vs. how difficult it is to reason about this referential context and thus leverage referential transparency. As I mentioned earlier, rebinding in structured ways is more amenable to equational reasoning - of which referential transparency is an example.
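A tiny concrete illustration of how rebinding disturbs the referential context (Python, invented names):

```python
# Sketch: after x is rebound, y can no longer be substituted by its
# defining expression x * 10, so substitution-based reasoning breaks.

x = 2
y = x * 10         # here y "means" x * 10
x = 3              # rebinding x breaks that equation:
assert y == 20     # y kept the old value; x * 10 is now 30

# Without rebinding, the substitution holds anywhere in scope.
x2 = 2
y2 = x2 * 10
assert y2 == x2 * 10
```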

Call-by-value seems like a separate issue entirely. I suppose it might be important for reasoning about certain side-effects, divergence, and exceptions models. In the last few years, however, I've been favoring language designs where CBV is just an implementation/performance concern.

Ease of Reasoning

My interpretation of Stepanov's comments is that it's okay to make reasoning about code harder, if it makes coding easier. We spend a lot more time coding than we do reasoning about it, so that would be the correct optimisation.

Hen and egg?

Maybe people reason so little because languages are optimised for making it hard?

We spend a lot more time

We spend a lot more time coding than we do reasoning about it

Do we really?

The average number of lines of code manipulated per programmer day is quite low, albeit heavily front-loaded with a long tail for maintenance and enhancement. I've seen estimates in the 10 LoC to 25 LoC range, though I'd be more interested in some real empirical stats from github or sourceforge. Assuming an average 4 hour workday in front of code (the rest spent on meetings, e-mail, travel, powerpoint, etc.) that's perhaps a line of code every ten minutes.

Meanwhile, time spent reasoning about code broadly includes isolating bugs, figuring out where to add an extension, reviewing APIs and documentation, exploring ideas, determining how to achieve some goal without breaking the system.

In my own ideal, the labor barrier to actual coding would be as small as possible... i.e. such that new ideas are readily expressed in an unceremonious, parsimonious manner without any need for boilerplate (such as imports, public static void main, initializing subsystems, etc.). Similar features are also useful for scripting.

Of course, some people have different ideals. E.g. in Why I Like Java, Mark Dominus describes Java:

Java is neither a good nor a bad language. It is a mediocre language, and there is no struggle. In Haskell or even in Perl you are always worrying about whether you are doing something in the cleanest and the best way. In Java, you can forget about doing it in the cleanest or the best way, because that is impossible. Whatever you do, however hard you try, the code will come out mediocre, verbose, redundant, and bloated, and the only thing you can do is relax and keep turning the crank until the necessary amount of code has come out of the spout. If it takes ten times as much code as it would to program in Haskell, that is all right, because the IDE will generate half of it for you, and you are still being paid to write the other half.

I wonder if Java developers might agree with your broad characterization that "we spend a lot more time coding".

Sounds Awful

That is completely the opposite of what I intended. Algorithms should be published, shared, and refined. They are never done, they can always be improved.

As I pointed out the mathematicians who actually do a lot of reasoning about code, like Knuth, seem to have no problem with rebinding symbols.

Stepanov identifies the key property for reasoning about code as regularity. This even allows mutable data and arguments to be passed by reference. The key property is that a regular procedure returns the same result if called with the same values (or objects passed by reference in the same state). A regular datatype has value semantics when assigned, etc.

the mathematicians who

the mathematicians who actually do a lot of reasoning about code, like Knuth, seem to have no problem with rebinding symbols

To point out that Knuth uses rebinding of symbols (which seems rather necessary for his target audience) does not imply it causes him no extra challenge with reasoning. Knuth tends to keep algorithms small and isolated, which helps resist and mitigate any resulting aggregation of complexity.

Algorithms should be published, shared, and refined. They are never done, they can always be improved.

Refinements require thinking about code. Reusing code means less time writing it. Thus, publishing, sharing, and refining algorithms, e.g. expressed as code, tends to favor the "thinking" side of the thinking-vs-writing spectrum. If you think the opposite, I'd be interested in your reasoning.

Stepanov identifies the key property for reasoning about code as regularity.

Stepanov should try debugging Cow or Whitespace or perhaps a program expressed as a cellular automaton. He'll soon reject his hypothesis that 'regularity' is the key concern for reasoning about code. ;)


Okay, I walked into that one. Clearly regularity is not the only property, but it is an important one. That's clearly my mistake, not Stepanov's. However, I take your CoW and raise you an Unlambda.

Most reasoning I can see being done about programs happens with their types, and as long as symbols keep the same types, changing their values does not affect that.

Clearly dynamic languages like Python and JavaScript are moving further from this, yet people seem to find them easier to use than statically typed languages.

Ten years ago I would have agreed completely that functional programming with immutable variables was the way forward. The problem is it does not make things easier. Trying to implement graph algorithms in functional languages is not easy, and our languages combine ad-hoc sets of features that make it difficult to determine which features contribute to making things better or worse.

My experiences lead me to believe that a quantified polymorphic type system with something like type classes is a good thing, and that attempts to add polymorphism in an ad-hoc way to languages result in a mess (C++ templates, Ada generics, Java generics). But I have also found that squeezing everything through a topological sieve to keep everything immutable makes some very simple and elegant algorithms hard to understand and harder to debug. There are many papers on implementing efficient union-find in functional languages and none of them are really satisfactory. I would rather take Tarjan's definition straight from the book, have access to all his reasoning, and use that. The code is smaller, more readable, easier to maintain, and I find it easier to reason about than all the work on persistent and semi-persistent data structures. I find that compactness and simplicity are more important properties for me personally in reasoning about code than whether symbols get rebound. I find "x = x + 1" to be pretty easy to think about. Stepanov also says that he thinks the theory of iterators is as fundamental and important to programming as rings are to number theory.

The efficient functional union-find using semi-persistent data structures actually cheats and uses an internally mutable data structure which is carefully constructed so it looks read-only from the outside.

My personal view at the moment is that I need a language with a low-level imperative sub-language, and a declarative (immutable) high level language for abstractions.

lots of comments

Assuming 'regularity' refers to something like 'avoids special corner-cases' then I agree it contributes to reasoning.

Types are useful for reasoning, but often fail to address reasoning about performance, concurrency, deadlock, divergence, partial failure, physical safety (e.g. in robotics), battery/power consumption, and many other properties. A more sophisticated type system might help, but today we need to reason about values in many ways.

Dynamic languages are easier to use because we're free to use them incorrectly. A few statically typed languages have taken notice, e.g. GHC's `-fdefer-type-errors` and the recent interest in gradual typing.

I grant that mutability is convenient for implementing graph algorithms, and I often use a State monad to express graph algorithms.

Regarding performance of union-find and other algorithms, I find it amusing that advocates of imperative languages claim to do better than O(sqrt(N)*lg(N)), which is the physical limit for random access to uniquely identified resources distributed in 2D space (or cube root for 3D space).

In practice, most complexity in programming is of the aggregate sort - lots of simple parts are combined into a rather complicated mess. While an example like `x=x+1` is small and trivial to reason about by itself, it contributes more than immutable variables to complexity in a larger context that uses x.

IMO, zippers and streams (depending on use case) are a fine replacement for conventional iterators, and are easy to use in a purely functional manner.

If you like the borderline between functional and imperative, give linear types a try. :)

Linear Types

Tarjan's union-find is O(m * inverse-ackerman(m, n)), which is better than O(m * lg*(n)), and is the best I am aware of. It seems difficult to get close to this with a functional language, yet the imperative code is just 4 or 5 lines for each function.
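For concreteness, here is a textbook-style sketch (Python, invented names, not Tarjan's exact presentation) of the imperative structure in question: union by rank with full path compression. Each operation really is only a few lines:

```python
# Sketch of imperative union-find: union by rank + path compression.
parent = {}
rank = {}

def make_set(x):
    parent[x] = x
    rank[x] = 0

def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x])   # full path compression
    return parent[x]

def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:
        return
    if rank[rx] < rank[ry]:           # attach the shallower tree
        rx, ry = ry, rx
    parent[ry] = rx
    if rank[rx] == rank[ry]:
        rank[rx] += 1

for v in "abcd":
    make_set(v)
union("a", "b"); union("c", "d"); union("b", "c")
assert find("a") == find("d")
```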

I don't see how any amount of streams and zippers can beat that elegant simplicity and efficiency :-)

So I am looking at something that allows small containerised imperative components to be plumbed together generically using a declarative logic language that integrates with the type system. The imperative parts may be monadically typed, I'm not sure yet, but the idea is that type inference deals with all that: you just write a little imperative program and let the compiler worry about the types.

I am looking at linear types and linear logic at the moment, but I don't know if it's what I want yet.

Tarjan's union-find is O(m *

Tarjan's union-find is O(m * inverse-ackerman(m, n)) [..] It seems difficult to get close to this with functional language

When big-O terms ignore speed of light, size of integer/pointer representation, cache or locality, and other physical concerns, imperative tends to do better than functional. Imperative analysis cheats, primarily based on a 'simplifying' assumption of O(1) array/pointer access.

If you do account for physical concerns, both FP and imperative will have the same asymptotic limits (though occasionally FP requires a more awkward construction, e.g. use of zippers to model locality). Coefficients matter, of course, and that's where FP tends to lag behind. Modern hardware does a decent (albeit imperfect) job of supporting the O(1) access illusion for imperative code up to the limits of RAM.

I don't see how any amount of streams and zippers can beat that elegant simplicity and efficiency :-)

I mentioned streams and zippers as alternatives to iterators, not union-find. In context, I'm not sure what you mean by "that" in the above quoted sentence.

So I am looking at something that allows small containerised imperative components to be plumbed together generically using a declarative logic language

Your vision isn't very clear to me... something like flow-based programming with wiring expressed in Prolog? In any case, your vision would make a good subject for your own blog articles.


I mean the elegant simplicity, and high performance of:

element function find (element x);
   do p(p(x)) /= p(x) -> x := p(x) := p(p(x)) od;
   return p(x);
end find;


Isn't that implementation wrong in the sense that it only updates the ancestor pointer of the current node, instead of (as it should) having each node in the ancestry chain retargeted to the canonical ancestor?

(That would say a few things about the "simplicity" of packing a loop, a test and an assignment all in a non-idiomatic single line.)

A correct ML implementation would be (but I don't find it "simple"):

let rec find x =
  match x.up with
    | None -> x
    | Some above_x ->
      let top = find above_x in
      x.up <- Some top; top

My Mistake

I can't cut and paste correctly, and I don't think any language can help me there. I missed the middle " -> x := ". Presumably without the '->' it would have been a syntax error.

It would be interesting to compare the machine code generated from the ML version and a C version.


I still don't see how your code would be correct, but it is so "elegant and simple" that I am unable to tell for sure.

If you have an ancestry chain of the form

A -> B
B -> C
C -> D
D -> E

you want find(A) to return E with an updated relation

A -> E
B -> E
C -> E
D -> E

It is clear to me why the ML version has this behavior: any find(...) call returns E and assigns the "up" pointer to this return value. The only way I see how to reason about your code is to execute it, and it gives something like

step (x=A)
p(A) is B
p(p(A)) is C
p(A) <- C
x <- C
A -> C
B -> C -> D -> E

step (x=C)
p(C) is D
p(p(C)) is E
p(C) <- E
x <- E
A -> C
B -> C
C -> E
D -> E

step (x=E):
p(E) = p(p(E)), stop

Path Halving

It's called path halving; see Tarjan, "Data Structures and Network Algorithms", page 29.

"When traversing a find path, make every other node on the path point to its grandparent". It has the same performance bound as the more common version, but without the nested loop or recursive call. If you repeated the analysis by running find a second time you would see the result you were looking for.

This is the more common version:

element function find (element x);
    if x /= p(x) -> p(x) := find(p(x)) fi;
    return p(x)
end find;

But I prefer the path halving algorithm. The ML version you give is exactly this, complete with mutable reference, so I don't see the difference here? I think perhaps we actually want the same thing, but are approaching from different sides, and in any case this algorithm is an example of mutable references rather than symbol rebinding, so doesn't really support my case.

Let's see if I can state my case more convincingly: even if we say that a language needs support for both mutable references and immutable values, I still think there is a case for symbol rebinding, what Stepanov calls value semantics. This would be equivalent to an ML reference that cannot have multiple references to the same value, so it is mutable, but assignment creates a copy. This seems useful in that it is more restrictive than a reference, but more permissive than an immutable value.


So it's just a different algorithm. Problem solved, thanks.

Tarjan's union-find is not the best

Tarjan's union-find is O(m * inverse-ackermann(m, n)), which is better than O(m * lg*(n)), and is the best I am aware of.

Despite big-O advantages, Tarjan's is not the fastest in practice because of cache effects. In fact, a long neglected version published by Rem in 1976 turns out to stomp all over its supposedly superior competition in both time and space.

For a simple persistent union-find, this one is just as simple as the imperative version, and performs comparably to an imperative implementation given their benchmarks. It's dependent on a fast persistent array, which the paper also describes.


I'll have to give it a try, although in my application path-collapsing is faster as find dominates union. I agree that worst case time/space behaviour is not a very useful measure. I generally benchmark against a realistic data set for evaluation.

I don't see how performance of the persistent version can be as good as the destructive in-place algorithm. Anything involving memory allocation is going to be a lot slower doing the same amount of work. (Were they using some kind of memory allocation in the case of mutation, rather than destructive in-place update?).

I think the key is that they are looking at Prolog style systems with backtracking. For my Prolog implementation, I am just using a simple undo list, where we store the nodes to unlink. If we think about the restricted operations needed in the Prolog implementation, I think these two are the same. Effectively the persistent array operates on the current version and 'undoes' the changes on backtracking, so all it does is package the 'trail' and the disjoint set together.

The persistent array used is actually a mutable object contained within an API that makes it look immutable. In the simple case where we don't want a history, this reduces to a simple mutable array. You could use this same persistent array structure in an imperative language to give value semantics, effectively optimising the copy-on-assignment.

This is an interesting point: nothing prevents you using immutable data structures in an imperative language, so an imperative language is 'superior' in the mathematical sense to a functional language, in that you can implement any functional (immutable data) algorithm in an imperative language, but not necessarily the other way around.

Anything involving memory

Anything involving memory allocation is going to be a lot slower doing the same amount of work.

Not much. It's just a pointer bump after all. Their persistent array basically performs ~5 writes instead of 1 write (4 to initialize diff, 1 to write the actual array slot). However, the write cost to the array may not be in cache but the bump pointer is. Thus, the array cache miss dominates everything and the 4 writes are practically invisible for non-trivial arrays, on average. This is why naive complexity metrics can be so misleading!

This is an interesting point, nothing prevents you using immutable data structures in an imperative language, so an imperative language is 'superior' in the mathematical sense to a functional language

Yes, mutability increases expressiveness. That doesn't imply that mutability ought to be the default though, which is what you seem to be saying. Obviously pervasive mutation has its downsides, as would any pervasive sort of side-effect. That said, I think imperative languages are poorly served by the examples thus far, so I'm looking forward to seeing what you come up with!

Not using Rem's algorithm

I just noticed the persistent-arrays-based union-find is not using Rem's algorithm, just union-by-rank. It's also using path compression rather than path halving; I wonder if it would perform better with path halving? In any case there may be room for improvement.

I don't agree with your assessment of the memory allocation. Extending an array is just a pointer bump, providing you have already reserved enough memory. Malloc is much slower, as it has to follow the free block list to find a block that is just big enough. As we don't want to encourage fragmentation, most mallocs start from the smallest block and work up. Faster ones use hash-tables of block size, or exponentially sized block pools (so you never waste more memory than the size of the allocation), but not normally just a pointer bump. Yes, with GC you may be able to bump the top-of-heap pointer, but you still pay the cost when the GC compacts the heap. It is interesting to consider that GC allows a lower call overhead for memory allocation initially, but recovers the 'cost' plus 'interest' asynchronously.

This could be another place it is losing out to potential imperative/mutable implementations, as a C++ vector can be extended in place, which reduces the malloc cost from N allocations to log2(N) allocations, as it doubles the reserved size of the vector every time it allocates. Obviously this kind of optimisation is only possible when you can explain to the language that you have an extendable array of the same object. Although it may be possible to perform the same optimisation on inductive data-types, it will not always be beneficial, so there would need to be some way to indicate to the language the best optimisation strategy.

This is really my point, if you hide the memory, you will need to provide optimisation hints to the compiler so it knows the runtime use patterns of the memory. You cannot infer this statically at compile time, because ultimately that would involve solving the halting problem. My preference is for direct control, rather than indirect control through hints/pragmas.

As for what I will come up with - who knows, it could end up being ML with (the good parts of) JavaScript syntax for all I know...


Direct control is good if you have stopgaps to (statically) ensure that what you write is correct. So far none of the mainstream languages without a GC has been able to provide that safety -- except by using reference counting instead (C++), which has dubious performance implications.

I think efficiency is an important topic, but to me correctness is always rated first -- because it is what lets you be productive instead of wasting time fixing bugs. Of course for some specific purposes you have hard constraints, and that's a good reason to develop safe low-level languages. I think they should be presented as extensions of general-purpose GC-ed languages, rather than insular languages, so that we can not bother with memory for the 80% of code that don't need to.

GC implemented as library/module

I think GC should be implementable in the language, so that different libraries can use different optimised memory management. So generic GC would be in the standard library, and you pay the cost only if you want to use it. I think as much as possible should be expressible in the language itself, to remove as much magic hidden in the compiler as possible.

I agree

But I think the better solution is exposing compiler behavior to customization (or getting rid of the black box compiler) by the programmer rather than polluting the high level language semantics with (necessarily ad hoc) low level operational concerns.

Ad Hoc?

I am not sure I agree the semantics would be ad-hoc? You can start from a machine model like the random-access machine, and have a linear type system. Turing introduced memory to mathematics to enable new classes of problems to be solved - there is nothing ad-hoc about the semantics of memory. If you consider the semantics of memory ad-hoc, then the whole tradition of mathematics is ad-hoc. You certainly could not write a garbage collector without memory, and I am sure there are many algorithms that cannot be implemented without memory.

I think getting rid of as much of the black-box of the compiler as possible is a good idea, although I think the idea of customising the compiler sounds more complex than simply exposing enough to the programmer to let them write efficient code in the first place.

How are you going to

How are you going to implement garbage collection without modeling fixed width pointers that don't fit into the random-access machine model?

what's the problem?

The Schönhage model has one unbounded base address register, everything else can be bounded. The heap base address can be loaded into the address register at program start. Now we can operate only using fixed width (offset) addresses. You might fix the base address at zero and just ignore it too - the second method seems preferable to me at the moment.

Everything else can be bounded

If you work with a fixed size heap, then I don't see how you can faithfully encode algorithms that deal with objects of unbounded size. Any translation will always introduce a new possibility: out of memory. If you abstract away from real memory and give yourself an unbounded heap, then I don't see how you're going to able to assume fixed width pointers.

Out of Memory

I don't see how this is any worse than current garbage collectors, and out-of-memory is a possibility in all functional languages (and they don't handle it very well). For example I have had Haskell crash my laptop rather than just exiting with out-of-memory, and ML could quit at any point due to memory exhaustion (the side effects are 'hidden' in the function 'arrow' type).


Current languages maintain a distinction between the platonic ideal, exposed in the language semantics, and the compiled implementation which can run out of memory. If you go to a Random Access Machine, you get to continue pretending you have infinite memory in your high level semantics, but you don't get to work with fixed width pointers and therefore might miss out on some implementation tricks. If you move to fixed width pointers, you lose out on the clean high level semantics.

My thinking is that we want to keep the explicit representation selection step that currently happens in compilers, but clean it up and expose it in a nice way as part of the programming process. i.e. we start with clean high level semantics that don't know about memory at all and infer a representation at a lower level. Or maybe we infer several lower-level representations (CPU, GPU, network distributed, etc.), and get to use different garbage collection policies in each setting (since no single policy will probably make sense for all of those settings).

Random Access Machine

As far as I understand it you can work with fixed width pointers in the random access machine. You are allowed to bound all the registers except the address register. However if the address register is pre-loaded with zero, and never allowed to change, you are effectively limited to addressing the range of the bounded registers which can be used for offset addressing. So if we have 1 address + N 64bit integer registers, we get 64 bit addressing, and 64bit data and are still within the formal model of the random-access-machine.

Yes, but if your e.g. linked

Yes, but if your e.g. linked list implementation only uses 64bit registers, then it isn't a faithful linked list implementation. If your language supports formal proof, then this is a problem because you'd like to establish that your linked list implementation works and it doesn't.

I don't see the problem.

I don't see a problem with doing the proof in the additive group of integers mod 2^n. In general, if you assume the list 'link' operation takes two nodes and links them, you can give the proof assuming all the nodes are already allocated, and no memory allocation occurs during link or unlink operations. Operations on existing lists obviously don't need to allocate memory, so the only issue is with insert, which might have to create a new node pointing to the data.

Then you implement memory transactions, so that you pre-allocate all memory required for the link operation (matching the requirements of the proof) in the begin-transaction.

    list reserve x
    foreach y in x 
        list insert y
    end for

Maybe functions could be used as transactions, to save creating more syntax? Then not having enough memory would just be an exception, and you wouldn't need pre-allocation; you would still need a method of rolling back incomplete operations, however.

The problem is that 64 bits

The problem is that 64 bits is not large enough to address an arbitrary set of nodes. If you assume that it is, you can prove false.

Pre-allocate memory

If we assume all nodes already exist in memory, so that values are singleton list nodes, then we never have to do any memory allocation in any of the list primitives. By simple case analysis we can prove the handling of lists is complete, and hence there is no problem. Effectively we prove all the list operations under the assumption that all the values and lists to be operated on already fit in memory. I don't see any problems with this proof sketch, do you?

My point is just that it is

My point is just that it is inconsistent to assume that all pointers are 64 bits when there are more than 2^64 distinct nodes. The problem with inconsistency is not that you can't prove the true thing you want but rather that you can prove the false things you don't want.

What can you prove false?

What can you prove false? How is it problematic to limit the number of nodes? If a linked list consists of a 64bit pointer, and a 64bit integer value for example, then you can have 2^63 nodes in memory max. So you have N nodes where N is in the group mod 2^63.

If you assume that 1) all

If you assume that 1) all pointers are 64 bits and 2) that you have as many unique nodes as you need for any purpose already allocated, then you can reach contradiction (prove false) formally by the pigeon hole principle. This is a shallow observation, and I have no idea what you mean by "you have N nodes where N is the group mod 2^63."

Pigeon Holes mod 2^63

The pigeon hole principle states that if you have N items and M containers, and N > M, then some container must hold more than one item.

Perhaps the bit you are missing is that you state "you have as many unique nodes as you need for any purpose" and I state "you have N nodes where N is a member of the finite group mod 2^63". What I am saying is that N (the number of items) is not a natural number or an integer, but a member of something like the discrete archimedean ring of numbers mod 2^63. So we have limited N to be a member of a finite group, and just like in number theory, we can have proofs about numbers and functions in that finite group. We prove by induction, so for M < N < 2^63 we can prove our list operations correct, without having to test every possibility directly, given that each of the N < 2^63 nodes occupies 2 words (I am assuming 64-bit word addressing, but you can adjust for byte addressing with N < 2^60) and pointers are 64 bits.

In fact I suspect we could prove our list operations correct for (M < N) < P, and then by induction on P prove that the list operations are correct for all memory sizes. So we can prove that if all the nodes fit in memory, the operations will be correct.

back to the start

Re: "we can prove that if all the nodes fit in memory, the operations will be correct"

This seems functionally equivalent to an infinite memory assumption.

Re: reasoning with fixed memory

Now your program has two lists and a map. How will your proposed proofs compose, if each of them have been proven correct assuming the full address space?

Assuming sufficient memory

I think you are trying to prove something else, like it's a closed group, which it isn't, so you obviously cannot prove that. You can prove each is correct with sufficient memory, so now all you need to do is check the assumption before you run the computation. It's just a proof under an assumption: not infinite memory, but finite memory P.

Assuming sufficient memory

Assuming sufficient memory to complete the operation seems functionally equivalent to assuming infinite memory. I cannot think of any example program that would distinguish 'sufficient memory' from 'infinite memory' and halt.

It is unlikely you can always predict memory usage ahead of time. For a simple example, perhaps your operation involves adding a list of strings to a trie. The number of nodes added to the trie per insert is a function of string length. The number of strings added to the trie is a function of list length. How would you go about checking this before running your computation? For a more sophisticated example, what if your operation was interpreting a script?

If you use lots of smaller checks, how is that an improvement?

Upper Bounds

I guess. There are two separate problems: is the operation correct for all possible inputs, and will this operation fit in memory? For the second, it is sufficient to show that an upper bound on memory usage will fit. If, whilst proving the operation correct, you can statically calculate an upper bound on memory use (dependent on dynamic argument data like string length and trie rank, for example), then you can know in advance if there is sufficient memory to complete the operation.

There is a simple case, for in-place destructive update, where no additional memory is needed, and that can therefore be proved correct with a static upper bound of zero additional memory. This tells you quite a lot more than assuming infinite memory.

For arbitrary values of 64?

When you write 'for all memory sizes', are you talking about making the pointer width a variable? You can certainly prove results of the form 'if all nodes fit in memory, then ...' everywhere, but the whole point of my objection is that this is annoying and an unwanted complication.

I still don't understand the significance of modular arithmetic to any of this.

Modular Arithmetic

Because you have to deal with pointer overflow. Add two pointers greater than half the address space and you get an overflow. Computer 'ints' do not behave like mathematical integers or natural numbers; they behave like mod 2^n groups (rings).

out-of-memory is a

out-of-memory is a possibility in all functional languages

Not all of them. Funloft and SAFL would serve as counter-examples. :)

In any case, out-of-memory isn't any more elegant in non-functional languages. The last attempted allocation gets blamed, even if it's tiny or the real problem lies somewhere else. Most of the time, there isn't any sensible action but to fail.

Turing introduced memory to

Turing introduced memory to mathematics to enable new classes of problems to be solved

And Church demonstrated it wasn't necessary.

If you consider the semantics of memory ad-hoc, then the whole tradition of mathematics is ad-hoc.

Mathematics and computations do require space. They don't require mutation. How you use 'memory' seems to conflate the two concepts.

You certainly could not write a garbage collector without memory

You can model a garbage collector that does not use mutation, i.e. that instead computes the 'next' state of a machine. Doing so may even be useful. Of course, you would need a garbage collector to run the resulting machine in bounded space. There is some bootstrapping involved.

I am sure there are many algorithms that cannot be implemented without memory.

This seems like a trivial claim. As I understand it, algorithms essentially ARE the implementation. Bubblesort is an algorithm, an implementation of a sort function, and assumes a mutable array. QED.

Yet, there are other algorithms that implement sort functions, and not all of them require mutable memory. Indeed, we can infer from the Church-Turing thesis that there is no computable function that requires mutable memory.

Turing and Church

Yet Church himself said he found Turing's method more satisfying.

Further it seems somewhat convenient to not consider construction a form of mutation. Lambda calculus is mutating the stack all the time. Unless you have infinite memory, you cannot actually implement a lambda calculus without mutation.

Not to put too fine a point on it

Evidently, unless you have infinite memory, you cannot implement a lambda calculus. Or any other Turing-powerful model of computation.

Easier to reason with imperative code?

I see this as the difference between natural numbers and groups. Just like in number-theory where we can operate in a group, I think one could define a "Finite-Turing-Machine". I am not sure you could do the same thing for lambda-calculus? The maths would seem much harder.

Could you say that for memory usage, lambda calculus is harder to reason about than the imperative model (FTM)? In which case one could conclude that people who claim functional programming is easier to reason about are being selective in their definition of 'reasoning'?

reasoning about memory usage

Modulo lazy evaluation, and assuming predictable tail-call optimizations, reasoning about space usage in lambda calculus is not especially difficult. It's just a matter of knowing which values are sticking around. All those asymptotic notations for space and time still apply.

Even if you evaluate lambda calculus on a 'finite machine', I don't believe it would be more difficult to reason about than an imperative program on a similarly sized machine.

For comparison, reasoning about space/time requirements for logic programming is typically much more difficult, i.e. since search is implicit.


Only if you don't care about cost

I think GC should be implementable in the language, so that different libraries can use different optimised memory management.

A good memory manager needs to be very tightly coupled with compiler and runtime system. I don't think you are aware of the degree of interaction and co-tuning going on between all those parts in modern, high performance language implementations. And that it is pretty mandatory to achieve competitive performance.

You won't get even close with a self-hosted GC implementation. In practice, it is merely a way to ensure that automatic memory management is dog slow, libraries don't compose, and the language has to be inherently unsafe. Boehm's collector is a good example of that, although it is an impressive achievement, as far as C++ goes.

I assume you're responding

I assume you're responding at least partially to the bit about different libraries using different memory management. I'm skeptical of that idea, too. But allowing large applications (browsers, game engines, etc.) to tune their own memory policies would surely provide opportunities to increase performance.

better abstractions?

what would magically fix this? or make it at least easier to see, manage, reify, wrangle, sketch on napkins? would some new language-ecosystem architecture make it such that we can frob the knobs at a higher level? or are we inherently doomed to have people who can keep all aspects in their head at the same time, and implement it in some low level language?

No magic fix

Nothing can "magically" fix this. You have to think of GC primitives (e.g., allocation), as an integral part of the instruction set of the abstract machine a GC'ed language is compiled to. And like with all other instructions, the compiler can try to optimise certain sequences and patterns iff it knows what these instructions do, and what invariants it can rely on. One recent example from V8 is allocation folding, which is only possible if the compiler knows exactly how the GC lays out and manages its memory. (In principle, of course, you could always try to keep broadening an "abstract" GC interface to support all the optimisations you come up with. But the resulting interface won't be very abstract anymore, nor stable enough for libraries to follow.)

Memory Allocation Reflection

In this particular example (allocation folding) it may be easier, as it could be decided at compile time through some kind of memory-allocation reflection. This would be a necessary part of allowing GC to be implemented in libraries anyway.

If that isn't sufficient, optimisations could be specified in the language. So for example, you write a garbage-collector library that includes optimisations for any code that 'uses' the garbage collector. The optimisations might be tree-rewrites on the abstract syntax tree of the language for example. I would prefer to avoid making this necessary though.

Don't see it

I don't see what sort of reflection you have in mind, or how you want to be able to use it at compile time. In any case, the presence of reflection usually only helps to make optimisations much harder, not easier.

I also don't see how you want to provide compiler optimisations through a library, while still staying sane or safe. For most languages and domains that would be a complete non-starter. I'm also pretty sure it wouldn't compose. Getting optimisation phases into the right order and have them interact in the best way is already a ridiculously hard and subtle problem in a closed-world compiler.