Interactive scientific computing; of pythonic parts and goldilocks languages

Graydon Hoare has an excellent series of (two) blog posts about programming languages for interactive scientific computing.
technicalities: interactive scientific computing #1 of 2, pythonic parts
technicalities: interactive scientific computing #2 of 2, goldilocks languages

The posts explain and contrast two approaches to scientific computing, Python (together with SciPy/SymPy/NumPy, IPython, and Sage) on one side and Julia on the other, as the result of two different design traditions: one (Python) following Ousterhout's Dichotomy of a convenient scripting language on top of a fast systems language, and the other rejecting it (in the tradition of Lisp/Dylan and ML) and promoting a single general-purpose language.

I don't necessarily buy the whole argument, but the posts are a good read, and have some rather insightful comments about programming language use and design.

Quotes from the first post:

There is a further split in scientific computing worth noting, though I won't delve too deep into it here; I'll return to it in the second post on Julia. There is a division between "numerical" and "symbolic" scientific systems. The difference has to do with whether the tool is specialized to working with definite (numerical) data, or indefinite (symbolic) expressions, and it turns out that this split has given rise to quite radically different programming languages at the interaction layer of such tools, over the course of computing history. The symbolic systems typically (though not always) have much better-engineered languages. For reasons we'll get to in the next post.

[..]

I think these systems are a big deal because, at least in the category of tools that accept Ousterhout's Dichotomy, they seem to be about as good a set of hybrid systems as we've managed to get so far. The Python language is very human-friendly, the systems-level languages and libraries that it binds to are well enough supported to provide adequate speed for many tasks, the environments seem as rich as any interactive scientific computing systems to date, and (crucially) they're free, open source, universally available, easily shared and publication-friendly. So I'm enjoying them, and somewhat hopeful that they take over this space.

Quotes from the second:

the interesting history here is that in the process of implementing formal reasoning tools that manipulate symbolic expressions, researchers working on logical frameworks -- i.e. with background in mathematical logic -- have had a tendency to produce implementation languages along the way that are very ... "tidy". Tidy in a way that befits a mathematical logician: orthogonal, minimal, utterly clear and unambiguous, defined more in terms of mathematical logic than machine concepts. Much clearer than other languages at the time, and much more amenable to reasoning about. The original manual for the Logic Theory Machine and IPL (1956) makes it clear that the authors were deeply concerned that nothing sneak into their implementation language that was some silly artifact of a machine; they wanted a language that they could hand-simulate the reasoning steps of, that they could be certain of the meaning of their programs. They were, after all, translating Russell and Whitehead into mechanical form!

[..]

In fact, the first couple generations of "web languages" were abysmally inefficient. Indirect-threaded bytecode interpreters were the fast case: many were just AST-walking interpreters. PHP initially implemented its loops by fseek() in the source code. It's a testament to the amount of effort that had to go into building the other parts of the web -- the protocols, security, naming, linking and information-organization aspects -- that the programming languages underlying it all could be pretty much anything, technology-wise, so long as they were sufficiently web-friendly.

Of course, performance always eventually returns to consideration -- computers are about speed, fundamentally -- and the more-dynamic nature of many of the web languages eventually meant (re)deployment of many of the old performance-enhancing tricks of the Lisp and Smalltalk family, in order to speed up later generations of the web languages: generational GC, JITs, runtime type analysis and specialization, and polymorphic inline caching in particular. None of these were "new" techniques; but it was new for industry to be willing to rely completely on languages that benefit, or even require, such techniques in the first place.

[..]

Julia, like Dylan and Lisp before it, is a Goldilocks language. Done by a bunch of Lisp hackers who seriously know what they're doing.

It is trying to span the entire spectrum of its target users' needs, from numerical inner loops to glue-language scripting to dynamic code generation and reflection. And it's doing a very credible job at it. Its designers have produced a language that seems to be a strict improvement on Dylan, which was itself excellent. Julia's multimethods are type-parametric. It ships with really good multi-language FFIs, green coroutines and integrated package management. Its codegen is LLVM-MCJIT, which is as good as it gets these days.

Hope to read this later.

Hope to read this later. Just skimming, I think you really need to say more about Mathematica. In some circles it is the dominant platform. I always found it intimidating, costly, and butt ugly. But people I respect swear by it. IPython notebooks may not be feature-equivalent yet, but they let me use something free, with a lovely community, that is in the same ballpark.

I read through this, but I'm

I read through this, but I'm not sure what I was supposed to learn. Perhaps these essays are just giving us a historical perspective?

Mathematica is a symbolic language that focuses on term rewriting. So it's quite different from a language where expressions are instead evaluated. I'm still not sure what epiphany they've had in making this a general-purpose paradigm (via the Wolfram language).

The author didn't go near APL, which is kind of strange. It was the first interactive calculation language, I think. They even had it running on typewriters... interactively.

You can do symbolic math

You can do symbolic math with something like SymPy, but this is still far from ideal. I think another critical feature of Mathematica is interactive graphs. There has been a lot of progress on this in IPython land in recent months.
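
For those who haven't tried it, here is a minimal SymPy sketch (assuming the sympy package is installed; expected outputs are in the comments):

```python
# A small taste of symbolic math in SymPy.
import sympy as sp

x = sp.symbols('x')
expr = sp.sin(x) * sp.exp(x)

print(sp.diff(expr, x))               # exp(x)*sin(x) + exp(x)*cos(x)
print(sp.integrate(expr, x))          # exp(x)*sin(x)/2 - exp(x)*cos(x)/2
print(sp.limit(sp.sin(x) / x, x, 0))  # 1
```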

Mathematica

Mathematica is a symbolic language that focuses on term rewriting. So it's quite different from a language where expressions are instead evaluated.

Clarifying a bit: Mathematica's primary rewriting strategy is essentially classic "normal order" evaluation with two twists: 1) evaluation iterates until terms stop changing, and 2) there are various operators reminiscent of "quote" that prevent some or all forms of evaluation from proceeding and interpret their arguments "symbolically".

The best overviews are here:
The Standard Evaluation Procedure
and here: Non-standard Evaluation

In practice, interactive programming is no different than any functional language, other than increased usage of explicit substitution and rule application. See Applying Transformation Rules
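
As a toy illustration of the evaluate-until-fixpoint style described above (plain Python, nothing like Mathematica's real pattern matcher):

```python
# A toy "rewrite until nothing changes" evaluator. Terms are nested tuples like
# ('Plus', 'x', 0); a rule returns a rewritten term, or None if it doesn't apply.

def rewrite_once(term, rules):
    """Rewrite children bottom-up, then try each rule on the result."""
    if isinstance(term, tuple):
        term = tuple(rewrite_once(t, rules) for t in term)
    for rule in rules:
        new = rule(term)
        if new is not None:
            return new
    return term

def evaluate(term, rules, max_steps=100):
    """Iterate rewriting until a fixed point is reached."""
    for _ in range(max_steps):
        new = rewrite_once(term, rules)
        if new == term:
            break
        term = new
    return term

rules = [
    # x + 0 -> x
    lambda t: t[1] if isinstance(t, tuple) and len(t) == 3 and t[0] == 'Plus' and t[2] == 0 else None,
    # x * 0 -> 0
    lambda t: 0 if isinstance(t, tuple) and len(t) == 3 and t[0] == 'Times' and 0 in t[1:] else None,
]

print(evaluate(('Plus', ('Times', 'y', 0), 0), rules))  # 0
```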

I'm still not sure what epiphany they've had in making this a general purpose paradigm (via the Wolfram language).

Absolutely no epiphany. They've always known that it's somewhere between workable and pretty damn good for programming, but the name "Mathematica" conveys a very limited purpose. Make no mistake, the name "The Wolfram Language" is simply a rebranding; one that appears to be quite successful in expanding the scope of those paying attention to it.

I highly recommend learning Mathematica to in-the-trenches programmers and PLT folks alike. It's head and shoulders above any other live programming environment out there, but it's far from perfect. There's quite a bit of two-way learning that can happen.

Wrong problem

That essay needs a good editor to tighten it up.

The author's contention that you need a fast language to call from the slow language is flawed. Python is slow only because of the Guido van Rossum Memorial Boat Anchor: the insistence that everything must be changeable at any time, even from another thread. Python programs can use "setattr" on almost anything at any time. The compiler can never assume that an object won't change in value or form between one statement and the next. This blocks almost all global optimizations, and many local ones.
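
A sketch of the kind of thing a Python compiler has to assume can happen at any time:

```python
# Nothing in Python stops other code (or another thread) from rewriting a class
# or an instance at runtime, so a compiler cannot treat p.norm() as a fixed call.
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3, 4)
print(p.norm())  # 5.0

# Another module can rebind the method...
setattr(Point, 'norm', lambda self: abs(self.x) + abs(self.y))
print(p.norm())  # 7 -- same call site, different meaning

# ...or change an attribute's type out from under compiled assumptions.
setattr(p, 'x', "no longer a number")
```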

The PyPy crowd has struggled to deal with this boat anchor in their compiler. Years behind schedule, they've finally succeeded in building a JIT compiler that can deal with sudden changes to running objects. It's painful. When some object is forcibly changed from the outside, PyPy has to abandon compiled code, drop back to a backup interpreter, run interpretively until out of the area of trouble, then rerun the JIT to get new compiled code. All this now works, more or less.

Calling C from Python is promoted as the "Pythonic" solution to the performance problem. But it risks breaking the memory safety of Python. Now you have all the memory management problems of C, interconnected with the memory management system of Python. The expected security vulnerabilities result.

The sad thing is that Python is potentially highly optimizable. Type inference allows using fixed types in most code, especially in numeric code. The loop structures are well suited to hoisting. But we're stuck with the boat anchor for now.

That is not the author's

That is not the author's assertion, but Ousterhout's. And even then it is a bit biased, coming from the designer of Tcl.

Multiple threads hinted at

I had the same reaction. There are several interesting topics hinted at, but they are mostly dropped very quickly.

I did not find the discussion of the dawn of computing very helpful. In an article about programming languages, let's just assume that we plan to have an interpreter or compiler, and go from there.

It's a good point about symbolic versus numeric computing. My impression is that numerical systems are much more widely applicable than symbolic ones. If you are working outside of pure mathematics, then you don't need a tidy closed form for whatever you are working on; you need a way to compute numeric data that are of interest for some application.

There's a rich design question hidden in the simplification down to the "Ousterhout Dichotomy". How do you fit high-performance subroutines into a general-purpose language? Talking about "C speed" is very misleading here: much of the reason to embed a high-performance sublayer is that code written naively, even in C, let alone a higher-level language, might not make good use of special hardware like vector units. Also, saying that Lisp abandons the Ousterhout Dichotomy seems misleading, because Lisp is a language designed for making new languages. The way I usually think about Lisp, it feels like Tcl+C, except instead of two layers you have 10-20 layers.

It's somewhat sad to see no mention of writing *correct* scientific code. Units checking can prevent a lot of errors. Likewise, using an existing numerical package rather than open-coding a new computation tends to lead to better general data-handling practices, whenever there's already a routine for the sort of computation you have in mind.
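
On the units point, even a hand-rolled quantity type catches the classic mistake; a minimal sketch (real units libraries do much more, e.g. tracking dimensions through multiplication and division):

```python
# A minimal unit-checked quantity: adding incompatible units fails loudly
# instead of silently producing a number with no physical meaning.
class Quantity:
    def __init__(self, value, unit):
        self.value, self.unit = value, unit

    def __add__(self, other):
        if self.unit != other.unit:
            raise TypeError(f"cannot add {self.unit} to {other.unit}")
        return Quantity(self.value + other.value, self.unit)

    def __repr__(self):
        return f"{self.value} {self.unit}"

distance = Quantity(100.0, "m")
print(distance + Quantity(20.0, "m"))   # 120.0 m
print(distance + Quantity(9.58, "s"))   # TypeError: cannot add m to s
```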

All in all, many interesting topics being broached in these articles. I entirely agree that scientific computing is an interesting and highly relevant area that can be somewhat opaque given the symbolic backgrounds PL geeks often come from.

If you are working outside

If you are working outside of pure mathematics, then you don't need a tidy closed form for whatever you are working on

That's over-stating it. You may want to derive analytic formulas (I know I do), not just produce numeric results. But I'd agree that this might be mostly of interest to people doing scientific theory, not applied numerical computing (e.g., data science).

I think a basic distinction should be made between exploratory programming of various kinds, and justifying final results (e.g., deriving, proving). Most blog posts etc. on scientific computing focus on the former, even when reproducibility and so on is being championed. So your point on correctness is well taken.

How do you fit

How do you fit high-performance subroutines into a general-purpose language?

This is a good point. I think one of the attractions of Python is that its "change everything from anywhere" approach enables really nicely done libraries. They are of course hacks, but when done right these libraries combine very nicely with the language. Think Pandas dataframes, NumPy arrays. This also allows such libraries to interface to foreign language routines that can do the heavy lifting and provide great speed advantages.
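
For example, the data-model hooks are what let a library intercept ordinary operators and quietly hand the work to compiled code; a rough sketch (here `fast_add` is plain Python standing in for a C or Fortran routine):

```python
# Sketch of a library type that hooks '+' and delegates the heavy lifting.
class Vector:
    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # A real library (NumPy, Pandas) would hand both buffers to compiled code here.
        return Vector(fast_add(self.data, other.data))

    def __repr__(self):
        return f"Vector({self.data})"

def fast_add(xs, ys):
    # Stand-in for a foreign-language routine.
    return [x + y for x, y in zip(xs, ys)]

print(Vector([1, 2, 3]) + Vector([10, 20, 30]))  # Vector([11, 22, 33])
```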

I don't see how "change

I don't see how "change everything from anywhere" leads to interfacing foreign languages more smoothly.

Suppose you want library

Suppose you want library code to be able to change the semantics of (some) assignment statements...
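
In Python this is possible for attribute assignment (though not for plain name assignment), e.g. via `__setattr__` or descriptors; a minimal sketch:

```python
# A library type that changes what 'row.price = ...' means: here it logs the
# assignment (it could just as well validate, convert units, or mark a row dirty).
class Tracked:
    def __setattr__(self, name, value):
        print(f"assigning {name} = {value!r}")
        object.__setattr__(self, name, value)

row = Tracked()
row.price = 9.99   # prints: assigning price = 9.99
print(row.price)   # 9.99
```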

Pervasive Generic Methods

Is the "change" part necessary? Or just the ability to participate in the dispatch of virtually all core operations? Julia, for instance, is almost entirely built on generic multi-method dispatch (docs here). In some sense, you're mutating the function dispatch tables, but it's mutation in a much narrower sense than altering the metaclasses for the top type. It seems to offer the same extensibility as the Python/Ruby/etc. models, or better.
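
A crude Python sketch just to make "multi-method dispatch" concrete (exact-type lookup only; Julia's real system handles subtyping, ambiguity and specialization):

```python
# A minimal multi-method registry: the implementation chosen depends on the
# runtime types of *all* arguments, not just the first one.
registry = {}

def defmethod(name, *types):
    def register(fn):
        registry[(name,) + types] = fn
        return fn
    return register

def dispatch(name, *args):
    return registry[(name,) + tuple(type(a) for a in args)](*args)

@defmethod('combine', int, int)
def _(a, b):
    return a + b

@defmethod('combine', str, str)
def _(a, b):
    return a + " " + b

print(dispatch('combine', 1, 2))                 # 3
print(dispatch('combine', 'multi', 'methods'))   # multi methods
```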

Good point, and I don't

Good point, and I don't really know the answer. Maybe someone else does.

nope.

Respectfully, you appear to have read neither article fully. While Hoare does appreciate the facilities of IPython et al, he does not advocate the split between scripting and speedy languages.

You also appear to have an outdated understanding of the state of programming language implementation. It is perfectly possible to produce fast language implementations with a squirrely mutable runtime: see Java (with dynamic classloading), LuaJIT, and all of the JavaScript implementations, not to mention the more historical example of Self and the example (from the second article) of Julia. They all use adaptive optimization. Python's dynamicity is not dissimilar to Lua's.

Well, to be fair

Well, to be fair, these relatively fast implementations quickly become ludicrous complexity monsters, with 100+ man-years easily sunk into, e.g., a competitive JavaScript VM -- 80 of which could probably be avoided by saner language design and less ridiculous amounts of mutability. And they are still nowhere close to native performance, let alone its predictability.

+1

as an end-user, it also freaks me out that i'm always depending on ever-growing inscrutable code bases. i have *no freaking idea* what 99.9% of hotspot is doing, i assume. so "a sufficiently smart {compiler,runtime,whatever}" is very much a double-edged banana.

HotSpot is pretty

HotSpot is pretty complicated, to be sure, but the lion's share of that is in the server compiler and the GCs, IMO. V8's complexity is more heavily centered in the runtime and the complexity of the object model, ICs, and compile-only architecture. JavaScript is just flat-out an insane tangle of complexity of interactions: mutability and indirection requiring speculation and backout and shortcuts on a level undreamed of in the Java world.

Who you calling a ludicrous

Who you calling a ludicrous complexity monster? :)

Actually in recent times I've started to think that V8's complexity is at least 75% incidental. I'd say why but I don't want to be shot.

If

If you mean driven by incidental picks of random benchmarks that bear no relation to reality but are what everybody is gaming, then yes, I agree. Although I still think your 75% number is far too high. Maybe half of that.

You're back in Europe now, nobody will be shooting. :)

Lancet & "Surgical Precision JITs"

Lancet has gotten me excited about the possibility that you're right: contemporary JIT implementations are mostly incidental complexity. I'd love to see more work in the direction of Surgical Precision JITs.

Mostly orthogonal

Well, this is nice in that it gives the user a way to control particular optimisations. But by itself, it doesn't really simplify much if the goal still is to be fast on common code without requiring user intervention. (In fact, the heavy reliance on partial evaluation, and the meta-circularity aspect that goes with it, can easily make a runtime even more complicated, though perhaps less ad-hoc.)

Yes, it's painful

While it is possible to produce fast implementations of languages with a "squirrely mutable runtime", it is not easy. It's been done for Javascript because it was worth the trouble to devote huge resources to the problem. (Javascript also lacks shared data concurrency, which eliminates many of the painful cases.) Julia's model of concurrency is remote procedure call, not shared data between threads. Much simpler from a locking standpoint.

Google's project to speed up Python, "Unladen Swallow", was a total failure. The PyPy project has struggled for a decade, is five years behind schedule, and still isn't either comprehensive enough for widespread use or fast enough to eliminate the need to do array math in C.

It's not fundamentally impossible, but it's demonstrably very difficult.

Necessary division, but may be possible in one language

I think the division is necessary. First, I don't think compilers or JITs will ever produce code as good as domain experts can. The idea that you can write high-level code and leave it to the compiler to optimise is, in the general case, wrong: you cannot optimise well for all use cases. What makes more sense to me is that domain experts write very fast and efficient libraries (perhaps implemented as optimising domain-specific languages) for specific purposes. To this end I think a low-level language with control over addresses, bits and bytes is necessary, with an execution model similar to the underlying processor's. However, you also want a way to use these low-level, highly optimised components together, so you need a powerful abstraction mechanism. My preference would be parameterised generic components, implemented with parametric polymorphism.

I am not yet decided whether a single language or two languages (in the same compiler) would be better. All in one would require something like linear dependent types, whereas two languages would pair a C-like low level with a Prolog-like high level. I think it is clearly easier to produce a working prototype of the latter that meets the goals: compiling C to fast code is a solved problem, and the componentisation can mostly be inlined out, or replaced with a function call or a C++-style virtual method call for static and dynamic polymorphism respectively. The former seems a much harder task to get something practically usable, but would seem to offer the neater language concept.

Possible in one language?

Not sure... In theory D is usable with or without the GC; in practice a large part of the standard library uses the GC, so...

There's the same issue with checked vs. raw ints: if you recompile everything it's not an issue, but as soon as you want to reuse binaries...

Constraints

It would be quite easy to enforce a non-GC subset of the language inside low-level components, preferably using the type system. The key is that the language must be able to express concepts like memory layout, allocation and deallocation, so that you can use them where GC is not allowed. Conversely, you might want to restrict those features to the inside of low-level components. Decisions as to the precise nature of the restrictions can be deferred, providing both the language and the type system are capable of expressing the required high- and low-level concepts in the same language.

Not really the problem

The issue isn't really technical but human/organizational: if you have a language which allows the use of a GC, developers will use it, so in the end you have a standard library which uses the GC, which makes not using the GC unpleasant...
As writing a standard library takes many man-years and is a very 'distributed' effort, it's difficult to force developers not to use the GC...
Sure, tools and type systems help discover code which uses the GC where it shouldn't, but the main issue is that avoiding the GC matters only for real-time development, which is a minority of cases, so it's difficult to require everyone to take care of issues which don't affect them directly...

Nobody forces Python devs to use C

To address the points: if developers don't need high performance they are welcome to use the GC; this is not a problem.

A non-GC component would look like a GC component from the outside. So, for example, let's say I write a high-performance matrix library that manages its own low-level memory; from the outside I would use it just like a matrix library that uses GC, except it would be much faster. It may be that you ship with a slow high-level matrix library and replace it with a low-level, high-performance one later. None of the code using the library need change, provided it uses the same API definition.
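
Roughly the shape I have in mind, sketched in Python only because it is the lingua franca of this thread (the class and method names are made up):

```python
# Callers program against the Matrix interface; the naive pure-Python engine can
# later be swapped for one backed by manually managed buffers or BLAS, with no
# changes to calling code.
from abc import ABC, abstractmethod

class Matrix(ABC):
    @abstractmethod
    def matmul(self, other):
        ...

class PyMatrix(Matrix):
    """Slow but simple reference implementation."""
    def __init__(self, rows):
        self.rows = rows

    def matmul(self, other):
        cols = list(zip(*other.rows))
        return PyMatrix([[sum(a * b for a, b in zip(row, col)) for col in cols]
                         for row in self.rows])

class FastMatrix(Matrix):
    """Hypothetical drop-in replacement that manages its own low-level memory."""
    def matmul(self, other):
        raise NotImplementedError("stand-in for the low-level implementation")

a = PyMatrix([[1, 2], [3, 4]])
print(a.matmul(a).rows)  # [[7, 10], [15, 22]]
```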

There is no using the GC where you should not: the type system can rigorously enforce it, so if you declare a component to be 'low-level' you will not be allowed to use GC inside the component definition.

There are plenty of applications that are not real-time where performance is an issue. For example, compiling a program faster makes a developer more effective. Improving the performance of libraries can reduce required hardware costs, or even make practical some application that was previously too slow to be of any use.

Hm

Doing that in a safe way, checked by a type system, is known to be feasible but maybe not as easy as you make it seem. To my knowledge ATS is the only language doing that -- Cyclone has used regions to support some, but not all, of the usual C memory ownership patterns. Chlipala's Bedrock verification framework also knows how to deal with allocation, but it's a heavier approach. None of those count as "quite easy".

Android NDK

One way to look at this is that it is similar to the Android NDK, communicating between a garbage collected language (Java) and C/C++. The marshalling is complicated on the C/C++ side because these languages have no awareness of the Java stack and heap formats.

A simple solution would be to start with a common basic syntax (operators, functions, blocks, etc.; having two just does not make sense), then add garbage collection on one side and memory primitives on the other (like Java and C++ respectively), except that they would share a common object and stack memory layout, and each would be aware of the marshalling requirements of the other, so the programmer would not need to code this explicitly.

This would not even need the type system to control it, as two separate sub-languages would exist, but I would prefer something enforced by the type system.

Maybe Cython/RPython is the

Maybe Cython/RPython is the closest thing?

I think it is necessary that two different things look different, so it is clear which universe I'm in. If my "low-level" and "high-level" code read the same, it is hard to switch context.

Two Syntaxes Better?

If developing a new language from scratch, would you really want to include two distinct syntaxes? Surely the differing use of operators would be confusing? Say one used () for functions and [] for arrays, but these were swapped in the other. Would this really help you write better code?