wondering why C is the language of compilers, when a Scheme subset would seem to be a better fit?

I was just listening to episode 57 of Software Engineering Radio ( http://www.se-radio.net/transcript-57-compiletime-metaprogramming )
I'm only 40 minutes in, but I'm wondering why C is the language of compilers, when a Scheme subset would seem to be a better fit?
(excluding the obvious reason of not wanting to rewrite gcc)

Please forgive my ignorance in this,

Stephen

C is the language of compilers?

I expect most LtU readers write compilers in functional languages, and certainly never in C. Even in the mainstream, "scripting languages" are probably more popular. All this is counting everything anyone writes that could be called a compiler; the most popular mainstream language implementations may very well be written in C, but that doesn't mean the people who chose C for them knew what they were doing. :)

If I remember correctly

The early GCC compiler was written in a C-stylized version of Lisp?

re early GCC

I think you don't remember entirely accurately. Before there was GCC there was a compiler in Pascal (by someone else) that inspired the design of GCC. GCC itself was written in plain ol' C (for it had to be bootstrapped via proprietary C compilers).

If *I* recall correctly, GCC used to work roughly this way (and much of this structure may or may not still remain, I don't know): front-ends translated all input languages to a generic tree form with some optimizations then performed on those ASTs. Those were then translated into a 3-address register transfer language (RTL) with additional optimizations there. Finally, the RTL was pattern matched and peephole-optimized against a machine description language (MDL) to generate actual machine instructions (well, assembler).

Much of the code and data structures were very influenced by "lisp style".

What you are probably remembering is MDL - the machine description language - which was given a lisp-like syntax and operators. You'd port to a new chip by writing the patterns and rules for how to match RTL and what instructions to generate for a match. (For all I know, that hasn't changed much, but I'm not sure.)
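If memory serves, a pattern in the machine description looked something like this (a from-memory sketch only; the operand details are surely off):

    (define_insn "addsi3"
      [(set (match_operand:SI 0 "register_operand" "=r")
            (plus:SI (match_operand:SI 1 "register_operand" "r")
                     (match_operand:SI 2 "register_operand" "r")))]
      ""
      "add %0,%1,%2")

The bracketed part is the RTL pattern to match; the final string is the assembler to emit for a match.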

It'd make some sense to say that the architecture-specific parts of GCC were written in a Lisp-stylized subset of/front-end to C, if you don't mind splitting hairs.

TeX Validator & Converter, trivial in a functional language

Tomasz Wegrzanowski wrote the texvc (TeX Validator & Converter) utility for MediaWiki in OCaml. When a Microsoft fanboy asks me a question like, "Why use F#?" I point them to this source code.

Every imperative object-oriented language fanatic I've ever met has been amazed to see how easy it is.

C has advantages and disadvantages

Some decades back, I wrote a CL in C. C was instrumental in writing the RTS, particularly the GC.

Some years before that, in the relatively early days of SQL, I wrote a report formatting language targeting tbl, troff, and ultimately a PostScript printer. In that case, with no need to implement a bit-twiddling RTS, awk performed admirably for my needs at the time. We needed C only to interact "generically" with the Sybase RDBMS via just a few utility programs.

Currently, I'm writing what is turning out to be compilers for a series of successively "complex" languages using PLT Scheme - mixing the "regular" latently typed Scheme language with the "typed Scheme" manifestly typed dialect available in the PLT system. The target is ia32 assembler supplemented with calls into a partially C-implemented RTS component. C is handy here - aside from my own calling convention, copying GC, etc., I have no desire to re-implement the malloc algorithms for large objects, FILE*-oriented buffered I/O, and so on and so forth.
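To give the flavor: the boundary between the generated code and the C side of the RTS is just a handful of extern functions, something like the simplified sketch below (made-up names and numbers, not my actual code):

    #include <stdlib.h>

    /* Objects at or above this size bypass the copying collector's
       nursery and go straight to malloc, so I never have to
       re-implement malloc's large-object algorithms myself. */
    #define LARGE_OBJECT_BYTES 4096

    static char *nursery, *nursery_ptr, *nursery_end;

    /* The real version would copy live data to a fresh semispace;
       stubbed out here. */
    static void rts_collect(void) { abort(); }

    void rts_init(size_t nursery_bytes) {
        nursery = malloc(nursery_bytes);
        nursery_ptr = nursery;
        nursery_end = nursery + nursery_bytes;
    }

    /* The entry point the compiler-generated ia32 code calls. */
    void *rts_alloc(size_t bytes) {
        bytes = (bytes + 7) & ~(size_t)7;   /* keep 8-byte alignment */
        if (bytes >= LARGE_OBJECT_BYTES)
            return malloc(bytes);           /* large object: let libc do it */
        if (nursery_ptr + bytes > nursery_end)
            rts_collect();                  /* copying GC kicks in */
        void *obj = nursery_ptr;
        nursery_ptr += bytes;
        return obj;
    }

The small-object fast path is a pointer bump; everything genuinely hard about allocation stays on the C library's side of the fence.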

At some point, a language in the language series might leave C behind for the RTS implementation, with P.J. Plauger's book on the C standard library algorithms, among several other algorithm texts, in hand.

But even then, I fully anticipate the necessity for some system akin to (or even) SWIG in order to (relatively) easily access the multitude of external libraries typically implemented in C, or at least accessible only via the C or Pascal (or stdcall on Windows) calling conventions.

If some of our forefathers coded on Smalltalk or Lisp machines, it's not too much of a stretch to claim that today at least most of us target our work at what we might very well call "C machines."

Unix targets

For compilers with two goals - 1) run on nearly every Unix variant and 2) bootstrap the compiler from source - C was the main choice, because C was the one bootstrap language known to be available on all those targets.

good point, stupid C stack (re "unix targets")

Yes. C is so often chosen as a target language because (until recent things like LLVM) it was the only easy way to target lots of machine architectures quickly and with half decent low-level optimization. It's disappointing (to me at least) that C never acquired de facto standard ways to (a) make at least simple optimized tail calls; (b) reflect at run-time on the contents of the stack. Even if GCC alone had added such features perhaps 15 years ago, when the need was already widely and painfully felt, it would have been a huge win. (Perhaps it is not too late.) (Yes, I'm aware of GCC's limited support for TCO but that's a little "too simple".)

C's stack discipline is the only big annoyance when using it as a back-end. That's been true and recognized for ages but somehow never gets fixed in a way that takes off and is widely adopted, in spite of numerous valiant attempts and some good polemics asking for it (e.g., Boehm's).
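For the record, the standard workaround when compiling a tail-calling language to C is a trampoline, sketched below (a toy with made-up names; a real compiler threads arguments through a heap-allocated frame rather than globals). Every function returns the next function to a driver loop instead of calling it, so the C stack never grows - at the price of an indirect call per tail call:

    #include <stdio.h>

    /* Each "compiled function" returns the next thunk to run (or NULL
       to stop) instead of tail-calling it, so the C stack stays flat.
       The function-pointer/void* casts are technically non-portable
       but work everywhere in practice. */
    typedef void *(*thunk)(void);

    static long n, acc;   /* stand-ins for the compiled code's registers */

    static void *done(void) { return NULL; }

    static void *fact_loop(void) {
        if (n == 0) return (void *)done;
        acc *= n;
        n -= 1;
        return (void *)fact_loop;   /* the "tail call" */
    }

    static long factorial(long k) {
        n = k; acc = 1;
        thunk next = fact_loop;
        while (next != NULL)        /* the driver loop */
            next = (thunk)next();
        return acc;
    }

    int main(void) {
        printf("%ld\n", factorial(10));   /* prints 3628800 */
        return 0;
    }

That tax on every call is exactly what native TCO support in C would have bought back.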

Another "woulda-coulda-shoulda" solution that would have been great would have been to have GCC support an RTL (intermediate form) front-end and compile programs directly from RTL. We could have had something akin to a static version of LLVM years and years ago. Alas, during the critical decade when this would have been a very progressive idea, technologically, the GCC maintainers were refusing on strategic principle to merge and support any AST or RTL front-ends to GCC because that would have made it too easy, in their view, to write proprietary front ends for HLLs that took advantage of GCCs free software optimization passes. (It would have made it easy to end-around the GPL because you wouldn't have to link your front end with GCC, just invoke GCC in a "pipe".)

Isn't that strange? For the aim of protecting software freedom, the GCC project - for more than a decade - refused to make it easier to use GCC as a generic end-stages compiler for HLLs. The technical opportunity was there - but the perceived political costs too high.

(In retrospect, I think that decision was dumb but hindsight is 20-20 and I also doubt that even today everyone would agree with me that the decision was dumb.)

Isn't that strange? For the

Isn't that strange? For the aim of protecting software freedom, the GCC project - for more than a decade - refused to make it easier to use GCC as a generic end-stages compiler for HLLs. The technical opportunity was there - but the perceived political costs too high.

Not strange... but it creates confusing jargon. For example, the Open64 compiler positioned itself as a research friendly compiler project. What exactly is "academic friendly" or an "open research compiler"?

When I was in undergrad, I could not possibly understand what abstractions defined the term "open research compiler". After I graduated, it did come into focus for me.

Portable ABI infrastructure never developed

C and C++ progress has been limited by the decision to keep treating ABIs as something unchanging, platform-specific, and mysterious. This shows up in too many ways to count, but a few examples:

1) Your examples of lack of stack control and reflection
2) C++ name mangling opacity
3) C++ static initialization order fiasco
4) C++ "export template" fiasco (a part of the 1998 standard that was generally never implemented)
5) Visual Studio 2008 and earlier versions do everything they can to make STL containers too difficult to use across DLLs.

Stroustrup struck some sort of Faustian bargain where Mephisto gave him reuse of a lot of existing back-end software in exchange for creating a language that assembles translation units instead of complete programs.

LLVM seems kind of exciting, but they are not writing linkers either.

LEX and YACC

Having tools to write the boring bits (the parser) is a kickstart that other languages were slow to offer.
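To see how much of a kickstart: the whole of a working four-function calculator is roughly the following yacc grammar with C actions (a from-memory sketch, so the declarations may be slightly off). yacc writes the entire parsing automaton; you write only the rules:

    %{
    #include <stdio.h>
    #include <ctype.h>
    int yylex(void);
    void yyerror(const char *s) { fprintf(stderr, "parse error: %s\n", s); }
    %}

    %token NUM
    %left '+' '-'
    %left '*' '/'

    %%
    input : /* empty */
          | input expr '\n'   { printf("= %d\n", $2); }
          ;
    expr  : NUM
          | expr '+' expr     { $$ = $1 + $3; }
          | expr '-' expr     { $$ = $1 - $3; }
          | expr '*' expr     { $$ = $1 * $3; }
          | expr '/' expr     { $$ = $1 / $3; }
          ;
    %%

    /* A hand-rolled lexer for brevity; lex would write this part too. */
    int yylex(void) {
        int c = getchar();
        while (c == ' ' || c == '\t') c = getchar();
        if (c == EOF) return 0;
        if (isdigit(c)) { ungetc(c, stdin); scanf("%d", &yylval); return NUM; }
        return c;
    }

    int main(void) { return yyparse(); }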

Proposed in 1994

...had to wait 'til I had some free time to dig it up as I could not remember the authors or title of the paper, and initial hits on Google were fruitless, but:

A New Architecture for the Implementation of Scripting Languages by Adam Sah and Jon Blow, 1994.

Nearly all scripting languages today are implemented as interpreters written in C. We propose an alternate architecture where the language is translated into the dynamic language Scheme [R4RS]. The plethora of high quality, public domain Scheme implementations give the developer a wide selection of interpreters, byte compilers, and machine code compilers to use as targets for her VHLL.

Our VHLL, Rush, provides high-level features such as automatic type conversion and production rules [SHH86][Ston93]. Performance benchmarks show that our system runs with acceptable speed; in fact, Rush programs run much more quickly than their equivalents in languages such as Tcl and Perl4. Whereas those languages are coded in C, Rush takes advantage of Scheme's existing high-level features, saving development time. Since the features provided by Scheme are among those most VHLLs share, we expect this approach to be widely applicable.

Also, I think the CLR's MSIL plus the DLR extensions provide a good back-end for this stuff. So does the JVM. Actually, you can write a dynamic language with more efficient bytecode interpretation than C#! C# lacks syntactic sugar for hashes that would give the compiler writer easy clues about when to use features like property indexers.

on multiple fronts in 1994

1994 was also when we said, for the GNU project, that the nascent Guile Scheme would be the default scripting / extension language for the GNU operating system. We proposed to translate other languages, such as Emacs lisp and Tcl, to Scheme.

One of the contributions of the Rush project was that it laid bare some warts in Tcl's semantics (having to do with scoping and "upvars" as I recall) that, if repaired, would have let a Tcl->Scheme compiler be both simple and powerfully effective. Fixing those warts would have broken but a small percentage of the extant Tcl code we could see, but meanwhile, Tcl/Tk was already deployed in mission/life-critical uses like running control panel GUIs on oil rigs and Ousterhout wasn't about to fix the warts.

cf. "The Tcl Wars" (declared after Sun Microsystems told the world that Tcl was to become the universal scripting language) and the mediocre but perhaps interesting paper from VHLL 1995: An Anatomy of Guile [...] by some blow-hard or other.

Twixt '94 and '95 or so I had the privilege of meeting Sah and Blow and working a little bit with Blow. By the time of the '95 paper, given the *perceived* pragmatic tactical needs of the GNU project and of Cygnus Solutions Inc., we had backed off translating Tcl to Scheme anytime soon and concentrated on marrying the two run-time systems as well as possible. This was seen as a temporary compromise and gave some temporary satisfaction to my pointy haired boss at the time but was largely a huge waste of effort.

A lesson I took from that period is that translating your favorite language ideas to Scheme actually is a very good way to implement a wide range of HLLs / "V"HLLs -- but that it works best if the HLL/VHLL you are translating to Scheme was designed in the first place to be consonant with Scheme's basic types and semantics.

but that it works best if

but that it works best if the HLL/VHLL you are translating to Scheme was designed in the first place to be consonant with Scheme's basic types and semantics.

I wonder what that really means. In particular, JavaScript was intended to do this (with, as sugar, Self's object system). However, you'd never get that impression if you looked at Sergio's JavaScript semantics. It gets closer with Arjun Guha's reexamination... but there's a *lot* of sugar and I think he took a strict interpretation for the initial work. Does that imply even if we want to design to be consonant, we really do need to use mechanized or semantics-based approaches from the get-go? Probably similar lessons from the Haskell, ML, and R*RS communities.

meaning of "consonant with scheme"

What I mean by "consonant with Scheme" is pretty (deliberately) naive - simple-minded. Purposefully unsophisticated:

I mean that to express the design of your language in the first place, you pick a widely agreeable dialect of Scheme and you express the meaning of programs in your new language by virtue of a simple translation into that dialect of Scheme. If you do the job well, the translation you offer should both (a) have useful enough performance for many real tasks, even if you can later get better performance using other techniques; (b) play nicely with code written in Scheme and with code in other languages translated to Scheme in a similar way.

For example, whether or not you believe in exactly R6RS numbers, you should have some notion of what a number is in Scheme and then stick to that in all the other languages that you translate to Scheme.

As for formal methods, hopefully your chosen definition of Scheme has a nice clear operational and denotational semantics. By defining your language by translation to your Scheme, you leverage that.

You say that "Javascript was intended [to be consonant with Scheme]" and that is news to me. Were that true, I would have expected a different treatment of numbers, the presence of TCO, code as data (not as source-code-in-strings) etc. Can you elaborate on how Javascript was intended to be consonant with Scheme as opposed to just borrowing from Scheme the notion of first-class environments as a way to conceive of Self-like prototype-based objects? (Wow, that's a mouthful! It's already way too complicated! Just show me the easy to think about translation to (a recognizable) Scheme, instead -- that's simpler and then I'll believe Javascript was designed to be consonant with Scheme.)

http://weblogs.mozillazine.org/roadmap/archives/2008/04/popularity.html

Brendan Eich would be a better person to talk to about this, but judging by this and related posts and my (very limited) interactions with him, a lot of the pain that the ECMAScript committee has been going through in the past few years would not have happened if you sent him back in time with all the PL knowledge he has gained since those formative days. As for Scheme-like, it is dynamically typed with closures, lexical scoping, and GC; more important is what it is missing that was popular at the time (Lisp, Java, C/C++, ML, Perl, and more tiny languages: pointers, static types, overloading, most reflective abilities, system access, multiple inheritance, dynamic scope, ...). Also interesting to see what was dropped from Scheme (macros, continuations, TCO, quoting) and what was added in (setters/getters, coercions, partial stack access, Self-style OO, etc.). Also fun to see the impact of modifying or innovating on language features from other systems (e.g., see Caja's list of JS vulnerabilities or 'fixes' from the various standard versions).

I think the semantic benefit of the translation approach is achieved when you mostly leverage Scheme, e.g., for lexical scoping. As you build up auxiliary data structures (e.g., for OO), new features will be defined in terms of them (e.g., fancy Self or Lua scope tricks), and you lose the benefits and converge into the typical pitfalls of designing a language by randomly extending a compiler or interpreter (probably worse in the latter case, as seen in Python/Ruby/etc.).

Heck, I don't see why a semantics-driven (even mechanized) approach won't ultimately lead to the same trap. Perhaps it's more of a question of 'when', not 'if'.

javascript v. scheme

But, the post you pointed me at makes it fairly clear that JS was not intended to be consonant with Scheme but merely to borrow some ideas from Scheme. That's different:

Whether any existing language could be used, instead of inventing a new one, was also not something I decided. The diktat from upper engineering management was that the language must "look like Java". That ruled out Perl, Python, and Tcl, along with Scheme. [....]

I'm not proud, but I'm happy that I chose Scheme-ish first-class functions and Self-ish (albeit singular) prototypes as the main ingredients. [....]

[....]I still think of it as a quickie love-child of C and Self. [....]

As an aside, the story Eich tells of Javascript's creation is personally mostly heartbreaking:

A bright spot is this "Later, in 1996, John Ousterhout came by to pitch Tk and lament the missed opportunity for Tcl." There is a back-story to that:

Early on in the life of Java there was a kind of "buzz" in the industry, and in Silicon Valley especially, that "scripting languages" were sure to be economically important. Executives and high-level managers had noticed that Tcl/Tk vastly reduced the cost of building many kinds of GUI, compared to, say, using the Motif toolkit. They noticed some early Python success stories. In some ways, this is when the "AI Winter" started to thaw and higher-level languages gained credibility in business. They weren't, however, particularly thoughtful or critical about what might make one language better than another, in the long run, in this or that situation.

Sun, at that time, hired Ousterhout and, as I mentioned elsewhere in these threads, announced to the world that Java was the programming language of the future, and Tcl, its "universal scripting language". That was, for a time, their apparent corporate vision: your browser, your servers, your cell phone, your TV - everything - would be programmed in Java with Tcl available for scripting.

Fans of other languages around the world beat back pretty soundly the notion of using Tcl. The GNU vs. Sun "Tcl Wars" were almost a kind of epiphenomenon on the wider-spread, grass-roots rejection of Tcl as a language with a bright future. Almost every scripting language in those days made a point of teasing Tk apart from Tcl and so we got Perl/Tk, Python/Tk, Guile/Tk, etc.

First sign of how crazy things were, though: Were it not for that pushback, given that Sun and Netscape were negotiating to get Java in the client, we might all be using Tcl instead of Javascript in the browser!

Second sign of how crazy things were is where Eich writes: "The diktat from upper engineering management was that the language must 'look like Java'."

Can you imagine? What does that even mean, "look[s] like Java"? In this context, where Netscape started off contemplating Scheme, it probably meant little more than "Not one of them parenthesis thingie languages. And don't call it Lisp!" The social circle of executives that included Sun had convinced themselves that Java was The Next Big Thing, which led to a lot of technically naive "diktats" like "has to look like Java". That was about the level of critical thinking I encountered in most upper management and executives I interacted with. That is why Guile came out with a C-like syntax that same year. That is why there was a (later abandoned) project to add a JVM-compatible byte-code interpreter to Guile. We didn't actually need JVM compatibility for any practical reason but it would, allegedly, give Guile a "Java story". All projects with the word "language" in the description had to have a "Java story". "AI Winter" was thawing, but there was a lot of sloppy slush on the ground, conditions were icy, and a bitter, mindless wind still tore through the Valley.

Heck, now you know why it got dubbed "Javascript" in spite of having no deeper relationship to Java than a vague resemblance.

Particularly heartbreaking to me is the list of names in and around the birth of Javascript. I'd met just about everyone on that list and worked with a few of them - but was not in touch with any of them at the time. Yet the executives and managers at my firm, where I was working on Guile, did have closer ties at Netscape. And so here at Cygnus we were working on an extension language library, built around a Scheme run time system but quite capable of hosting a language that "looked like Java", and somehow the connection never got made.

Internally, at the firm where I worked, there was a kind of mindless backlash against Scheme and in favor of, of all things, Tcl. The project lost resources and I was pressured to resign. While new releases of Guile continued after I left, the Guile project itself languished, with the initial goals essentially abandoned.

In short, we came quite close to an alternate future in which browser clients came with a Scheme engine, probably with an optional C-like syntax. We missed out because of executive-level hype about Java and Tcl. And we missed out because of the Silicon Valley approach to software strategy which Eich sums up nicely: "What was needed was a convincing proof of concept, AKA a demo. That, I delivered, and in too-short order it was a fait accompli."

Bummer.

engineering management was

engineering management was that the language must "look like Java". That ruled out Perl, Python, and Tcl, along with Scheme. [....]

I read that as a syntactic constraint: it had the same intent as Guile except instead of being a translation into Scheme, he rolled his own interpreter. The syntactic restriction explains why macros weren't there, and rolling his own suggests why we had the weirdness with hoisting. The lack of continuations follows from his claim to have just seen SICP (which I interpret here as the general Lisp/Scheme approach); there's a big leap from being able to use manipulations to appreciating them (and another to understanding implementation implications).

This overall interpretation seems consistent and supports my thought that designing for the translation approach, when ad-hoc, isn't really enough. (Or, perhaps, the first attempt is never right?)

As for getting a 'good' language into the browser, a JS interpreter or source-to-source translator is often fast enough and I know many folks who don't write raw JS (including in the commercial / startup world). If you have an idea, the only barrier is yourself :) In contrast, I'm currently interested in the more performance sensitive mobile space, where the current JavaScript+DOM approach is too expensive, and even more so as a compilation target. Stuff like Sean's Bling is more appealing as it could actually use the hardware. There's a definite void here for a new base language.

timeline and javascript+mobile

I think you are misreading there a bit. As I recall, Nick (the guy that introduced Eich to SICP) had left SGI some years before these events. Eich had almost certainly encountered SICP a few years before the events described. I do not disagree, however, that there's a big leap from knowing how to use a thing and appreciating it (though I make no claim of where Eich was or is on that spectrum).

I think you are overly generous to the (stereotypical) executive or high level manager by interpreting "looks like Java" as a syntactic constraint. To be sure, I think part of the message there was the (ignorant, pointless) syntactic constraint to use curly braces and algebraic notation rather than s-exps but, as nearly as I can tell, the only "principled" reasons they had for imposing such a constraint were (a) to express dominance over the hackers; (b) because their executive and management friends shared an anti-lisp superstition; (c) because the Java hype in the executive/mgt. class was thick as molasses at the time and they wanted a "Java story". In other words, they were "bike shedding" (as in, arguing over the unimportant decision of what color to paint the bike shed instead of making strategic choices about the structural engineering of the bike shed). You know, when Steele et al. start coming up with their (fairly radical) syntactic choices for Fortress, there is some pretty deep experience and principle behind that. When an exec in 1995 is saying "no parens!", he's much more likely to be responding to rumor and innuendo.

I don't think "ad hoc" language design of a high level language by translation to a very minimal yet general high level language like scheme is "enough" either, but I think that what you need in addition is most often just some experience-based good taste and a strong dedication to K.I.S.S. (at least most of the time).

Regarding source->Javascript translation and such: yes, I'm aware of at least some of those hacks. All I can say is to express my opinion that they would be simpler, and there would be more of them, were the target language at least much closer to Scheme. It's a struggle to translate in the face of Javascript's complexity and quirks in a way that would be less likely in the face of Scheme's simplicity and generality. At the same time, Javascript makes a lousy target language for compiling Scheme.

Penultimately, regarding the mobile space vs. Javascript+DOM (and another lost opportunity from the Guile days): There is a lot of headroom to improve DOM implementations for complexity, space, and time - and to identify useful subsets for limited-capability devices. I doubt DOM is your real problem - just current DOM implementations. And don't you wish you could get by with a tiny Scheme rather than having to implement Javascript?

Lastly, if it were not for some really needless quirks in Javascript I suspect it would translate handsomely and easily to Scheme in a way that let it inter-operate smoothly with native Scheme code. For mobile, if you really wanted to run javascript code, you could do it via such a translation. Alas, back in the real world, it won't work cleanly because Javascript has strange notions about things like numbers and strings.

Does that imply even if we

Does that imply even if we want to design to be consonant, we really do need to use mechanized or semantics-based approaches from the get-go? Probably similar lessons from the Haskell, ML, and R*RS communities.

Sort of.

There are similar lessons to be learned from large-scale object-oriented projects like the Java Virtual Machine and its lack of upfront design leading to poor modularity (see the Android fork and dismemberment of Java ME as examples). You can't embed the JVM in small devices that easily. A goal of the Mono project is to build a "build your own framework" platform where you can use Mono+your app as an "appliance", dynamically reshaping not just assemblies but also the VM itself. This half is not exactly something you mechanize, but rather depend on good design decisions for.

So the solution to good systems design is half math, half intuition. You can still check your intuition using PLT concepts like dependency structure matrices, though. Those are based on hard, simple math (linear algebra).

I'm finding your comment

I'm finding your comment very hard to decipher.

This half is not exactly something you mechanize, but rather depend on good design decisions for.

What is the lesson of Mono? Because they had a design goal, they can achieve it? Or is it because they actually tested on embedded systems early on, while the JVM diverged?

Those are based on hard, simple math (linear algebra).

There isn't much math to what they present (and this sort of analysis is much 'softer' than what we expect from typical compiler analyses): the interesting part is that you can tolerate dependency cycles when they land in blocks on the diagonal for some ordering. This is a common trick, however: compressing cycles to nodes to get an acyclic form and then doing something tricky on those hard cases (in their paper, nothing). The hard part is applying it to software... but they do it for a modeling language. This gives them room to experiment. E.g., we can interpret some operators -- multiplication might represent sequencing or composition, while addition is some sort of priority -- but this stuff has very unclear benefit and they don't do any of it at all.
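To make the diagonal point concrete (my toy example, not the paper's): take modules A, B, C where A depends on B, B on C, and C on A, with an x in row M, column N meaning M depends on N. No reordering of rows and columns makes all the marks fall below the diagonal; the cycle always leaves one above it, which is exactly what the DSM display makes visible:

        A  B  C
    A   .  x  .
    B   .  .  x
    C   x  .  .

Collapse the cycle {A, B, C} into a single node and what remains is trivially triangular - that's the cycle-compression trick.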

Of most concern, I have no idea where you're going with referencing it and what it has to do with designing a language. The language's implementation, or perhaps its features? Why are cyclic dependencies bad? Etc.

Edit: I also wanted to write that paper is not about PLT concepts.

Drawing Hands

I'm saying that when subsetting a project, it is extremely beneficial to remove unnecessary dependencies.

The paper itself is not heavily PLT-related, and I suppose if you wanted to be strict, such a paper is much better classified as "Quality of Software Architectures" and might better fit at the QoSA workshop. [Edit: Apparently, a QoSA 2009 paper by LL Bean, Achieving Agility Through Architecture Visibility explains how they use Lattix's DSM tool.]

I do think that when you build large libraries and include dependencies to glue stuff together, that dependency structure matrices can illuminate problems that are hard to see in the small. Most PLT research does not focus on seeing large-scale partitioning problems, at least not that I'm aware of. e.g., how do you go from a non-modular solution to a modular solution?

Think of WPF, the mother of all cyclic dependencies. Everything in WPF leans back on itself like Escher's Drawing Hands. This is literally due to how information flows through WPF and how WPF resolves dependencies. Had the designers realized this and used a tool like dependency structure matrices, they might've realized they should factor out the dependency subsystem into its own assembly and share it with Workflow Foundation and Windows Communication Foundation. As it is, Microsoft -- across all .NET 3.0 foundation classes -- has had to create GoF bridges to patch over the poor abstractions and serve as gateways between the supposedly "foundational" systems. Shouldn't foundational systems talk to each other easily?

I feel usability is a tool to educate good design, and PLT often makes strong inroads with practitioners when it can show "this doesn't work, here's why, here's a solution." One example would be Felleisen's and Queinnec's papers on using continuations for "conversational style" web programming. What those papers show is how you can eliminate certain problems with treating web frameworks as pipelines by mapping a user session to a server-side cache. At the same time, you remove the need for the programmer to write a lot of boilerplate Data Transfer Objects to shuffle code between layers.

By the way, for .NET, a company actually sells a tool to help with DSM analysis. NDepend; it also comes with a language for querying .NET code. There is also Lattix, which is the company represented by half the researchers in that OOPSLA paper. I tried Lattix and I didn't like it. Also for TSQL, it required a lot of manual rules on my own part.

Why are cyclic dependencies bad?

Cyclic dependencies are bad because they create yo-yo APIs where objects ping-pong messages back and forth. Rather than understanding the context of the interaction, and using that context to broker the message passing, the objects communicate directly, and in doing so effectively share each other's implementation details to get things done. WPF is a prime example of this. E.g., the design of things like TemplateBinding "to improve performance" is directly related to how templates are allowed to be resolved internally; it is simply too expressive (it allows for cyclic constraints). The very idea of TemplateBinding is to trim the search space for a dependency by introducing a cut operator.

As an aside, I realize many PLT researchers dislike my equivocation between systems and PLT. Even functional programmers dislike it. I've written discussion forum posts on Google Groups that have been voted "1 star" simply because I wasn't using terminology the audience was familiar with. I think this is out of ignorance, and not understanding that well-educated perspectives on a problem domain lead to interesting applications for "mechanized metatheory" and other disciplined analysis & synthesis. We're actually starting to see this now with Software Configuration Management systems. 15 years ago, it was impressive to describe the taxonomy of SCM solutions and what every SCM has in common and how they differ. Today, it is more interesting to automatically generate an efficient SCM solution from a model description, customized by the model parameters an individual person needs.

Of most concern, I have no idea where you're going with referencing it and what it has to do with designing a language. The language's implementation, or perhaps its features?

In my mind, the first paper that discussed a layered approach to systems design was Dijkstra's The Structure of the 'THE'-Multiprogramming System. In it, Dijkstra made his landmark contribution to operating system design, showing how layering various concerns could achieve better reuse and system stability. Today, some of his ideas are invalidated, but all modern OSes are still based on them, even though PLT now has potentially better ideas. It's worth noting Dijkstra also coined the term "separation of concerns", which is now strongly linked with Kiczales' aspect-oriented programming. Although, even before that, Scott and Strachey were seeking a more basic understanding of these issues.

What is the lesson of Mono? Because they had a design goal, they can achieve it? Or is it because they actually tested on embedded systems early on, while the JVM diverged?

The lesson of Mono and NekoVM is dependency inversion. Dependency structure matrices tell you whether or not you're actually applying the Dependency Inversion Principle successfully. Just because the ideas behind DSMs started off in building architecture (Christopher Alexander) and production line management (Don Steward) shouldn't exclude their use in the modular definition of systems. From there, the concept was really promoted by Michael Jackson, who took ideas from compiler architecture in the late '60s and early '70s to come up with what was then known as Jacksonian Inversion. The PLT aspects of Jacksonian Inversion were NOT lost on PLT researchers, most notably C.A.R. Hoare, who compares CSP to JI!

Now, a DSM is not a perfect way to check a result, just as spot-checking a single value like F'(1) = f(1) is not a perfect way to verify that F(x) is an antiderivative of f(x). More foundational ways, based on proofs, exist. However, that is very heavy machinery, and for the sort of control partitioning needed in most systems I work on, unnecessary.

[Edit: I fixed bad links. Sorry for carelessness.]

I still can't see how to

I still can't see how to map, in a concrete sense, a notion of dependency, to language design. Your example was about API design.

As for systems vs PL, both have different (and sometimes overlapping) principles. E.g., a PL can be viewed as a system. They're both important disciplines, and each is often useful even if you're working in the other. However, they're distinct enough that conflating them seems odd to me. Similarly, the database, software engineering, and embedded communities have grown far enough apart from systems and PL that not recognizing how the principles differ may also be a disservice. I am all for cherry-picking approaches from all of the above.

Bottom Line

A dependency structure matrix can be used to highlight systemic problems in how a program manages and uses dependencies. It can help guide refactorings, rethink memory allocation and object factory design, improve control partitioning in a distributed system, etc. [Edit: Most large frameworks invent, or call for the need for, a DSEL / DSEC to provide first-class abstractions for the concepts they want API end-users to use. WPF with its dependency subsystem is no different. However, looking at WPF, we can clearly see that defining an API as a DSL has many flaws that knowledge of PLT might help you avoid! DSM is a very simple "guess and check" method, although hopefully you are not doing TOO much guessing.]

It is especially useful when you have a shifting problem domain. For the problem of fitting a bunch of VM code to a specific target environment, understanding what dependencies you have is useful in implementing the system your language rests on.

You are right that this is about API design. I am saying API design is not that different from language design.

Actually, for progressively more declarative languages, it will be interesting to see how compiler writers handle environments that use language extensions and compiler extensions, selectively. Languages do/can have dependencies! A Kitchen Sink environment is a good example of this. Haskell's core compiler, GHC, has to regression test optimizations to make sure they don't slow down other kinds of optimizations. This is the case for traditional compilers, too, but I suspect it will be a much greater problem for languages as we approach "automatic programming".