Links

You can see slides from the Links meeting here and commentary and pictures here. (Thanks to Ethan Aubin for already starting a thread under the former, and to Ehud Lamm for inviting me to guest blog.)

Ethan Aubin writes:

So why do we need a new language? What cannot be accomplished with existing frameworks? There is a slide following this asking why you can't do this in Haskell or ML, but I don't know why they (or even java/php/etc) aren't enough.
Let me try to answer this. Links is aimed at doing certain specific things.
  • (a) Generate optimized SQL and XQuery directly from the source code -- you don't need to learn another language, or work out how to partition your program across the three tiers. This idea is stolen from Kleisli. You need to build it into the compiler, so Haskell, Ocaml, or PLT Scheme won't do.
  • (b) Generate Javascript to run in the browser directly from the source code -- you don't need to learn another language, or work out how to partition your program across the three tiers. We're hoping to provide a better way to write AJAX style programs. Again, you need compiler support -- vanilla Haskell, Ocaml, or PLT Scheme can't do this.
  • (c) Generate scalable web interfaces -- where scalable means all state is kept in the client (in hidden fields), not in the server. PLT Scheme and WASH provide the right interface, but they are not scalable in this sense, precisely because making them scalable involves fiddling with the compiler. (Felleisen and company have pointed out the right way to do this, applying the well-known CPS and defunctionalization transformations.)
So that's my argument for a new language.
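
To make (c) concrete, here is a toy sketch - in Haskell rather than Links, with every name invented for illustration - of what the CPS and defunctionalization transformations buy you: the "rest of the computation" becomes a first-order value that can be serialized into a hidden form field, so the server holds no session state. (A real implementation would also escape the HTML and sign the value; see the security discussion in the comments below.)

    -- Each constructor names one "rest of the computation" in a signup flow.
    data Cont
      = AskName                -- next: ask for the user's name
      | AskEmail String        -- have the name; ask for the email
      | Confirm String String  -- have both; ask for confirmation
      deriving (Show, Read)

    -- One request/response step: apply the saved continuation to the
    -- client's input, producing a page and the next continuation.
    step :: Cont -> String -> (String, Cont)
    step AskName         input = ("Hello " ++ input ++ ". Your email?", AskEmail input)
    step (AskEmail name) input = ("Confirm " ++ name ++ " / " ++ input ++ "?", Confirm name input)
    step (Confirm n e)   _     = ("Registered " ++ n ++ " <" ++ e ++ ">", AskName)

    -- The continuation travels to the client in a hidden field and
    -- comes back with the next request; any web server can handle it.
    toHiddenField :: Cont -> String
    toHiddenField k = "<input type=\"hidden\" name=\"k\" value=\"" ++ show k ++ "\"/>"

    fromHiddenField :: String -> Cont
    fromHiddenField = read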

Is it a good enough argument? Is this enough of an advantage to get folk to move from PHP, Perl, Python? Not clear. I suspect if it is good enough, a major motivating factor is not going to be anything deep, but simply the fact that being able to write everything down in one language instead of three or four will make people's brains hurt less.

Ethan Aubin also writes:

Wadler goes into the FP success stories: Kleisli, XDuce, PLT Scheme (Continuations on the Web), Erlang. If you take the benefits of these individually, you've got a language which solves the 3-tier problem better than what we have now, but I don't think it meets the criterion of "permitting its users to do something that cannot be done in any other way". So, I'd like to ask all the perl/php/asp/pythonistas on LtU: what is the killer-app that your language cannot handle?
I'd love to see answers to this question!


Language vs. Compiler?

You need to build it into the compiler, so Haskell, Ocaml, or PLT Scheme won't do.

At most this proves that specific implementations (compilers) are not up to the task. It does not trivially follow that these languages themselves are unsuitable for this "extraction" of SQL/JS out of Haskell, Ocaml, or Scheme code.

I believe there are arguments for this conclusion; I just don't see them on the surface (of the presentation). Are the type systems of these languages ill-suited to doing this extraction efficiently? Is their semantics? Their abstract syntax? Concrete syntax?

User base? :-)

It would be very interesting to learn your opinion on this!

Compiler support?

You need to build it into the compiler, so Haskell, Ocaml, or PLT Scheme won't do.
At most this proves that specific implementations (compilers) are not up to the task. It does not trivially follow that these languages themselves are unsuitable for this "extraction" of SQL/JS out of Haskell, Ocaml, or Scheme code.

Also, a nit: the emphasis on needing compiler support for these things seems to be predicated on not having macros. In fact, Scheme and Lisp can do at least (a) and (b) above in an integrated fashion. SchemeQL is a good example of this.

Although I'm fairly sure it wasn't his intent, Philip is in fact making a great argument for the necessity of macros. :) This is particularly true if there's any interest in delegating significant tool-creation capability beyond the designers of a language.

Of course, the argument can be made that macros make Todd mad, and that most programmers want more syntactic sugar than a regular syntax offers, which means that compiler support is in fact wanted to satisfy this perception of market requirements.

[P.S. I'm not arguing against the potential value of a language like Links.]

Language Problems?

While I was initially enthusiastic about the Links project, I'm starting to have my doubts.

In particular, I don't think that some of the things you are trying to solve are really language issues.

Generate optimized SQL and XQuery directly from the source code
Generate Javascript to run in the browser directly from the source code

The big problem with both of these things is the heterogeneity of environments. Depending on which specific RDBMS or browser you are running, "optimized" means something slightly different.

In particular, Javascript tends to run slightly differently on every browser/platform combo. I think the hype about AJAX has glossed that over. I suspect that Google just has enough bodies to support Javascript libraries for all the combos they want to support.

I'm not sure that making Links into a big multi-platform code-generator will really solve the complexities here.

where scalable means all state is kept in the client

One word here: security. If the client has the definitive view of the universe, the server is vulnerable to manipulation by the client. There will always have to be SOME state at the server side.

Maybe you can off-load a lot of "state" to the client, but you still may open up performance issues, privacy issues, client-side security issues, etc.

Again, I'm not sure that these are language issues per se, but architecture issues inherent in web development.

SQL Generation

HaskellDB doesn't yet generate optimized SQL for different environments, but it could, and probably will one day. It abstracts away from SQL by acting as a compiler for an embedded relational query language. I imagine that something similar could be done with XQuery and Javascript - make the different platforms different targets of a retargetable compiler.
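
To sketch what "retargetable" could look like - this is emphatically not HaskellDB's actual API, just a toy illustration - the same query AST can render to different dialects. The one dialect difference shown is real: row limiting is LIMIT in PostgreSQL/MySQL but TOP in SQL Server.

    import Data.List (intercalate)

    -- A tiny query AST: which columns, which table, optional row limit.
    data Query = Query
      { columns :: [String]
      , table   :: String
      , limit   :: Maybe Int
      }

    data Dialect = Postgres | SqlServer

    -- Render the same query for different backends.
    toSql :: Dialect -> Query -> String
    toSql d (Query cols tbl lim) = case d of
        Postgres  -> select ++ maybe "" (\n -> " LIMIT " ++ show n) lim
        SqlServer -> "SELECT " ++ maybe "" (\n -> "TOP " ++ show n ++ " ") lim
                       ++ colList ++ " FROM " ++ tbl
      where
        colList = intercalate ", " cols
        select  = "SELECT " ++ colList ++ " FROM " ++ tbl

    -- ghci> toSql Postgres  (Query ["id","name"] "users" (Just 10))
    -- "SELECT id, name FROM users LIMIT 10"
    -- ghci> toSql SqlServer (Query ["id","name"] "users" (Just 10))
    -- "SELECT TOP 10 id, name FROM users"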

generate optimized SQL for different environments

From my experience, you will always need a way to get at the raw SQL generated and tweak that. Databases are strange beasts, and optimizers change the query path constantly. What used to be an optimized query suddenly runs slowly. Even as table sizes change, the optimizer changes the query path. I have found that just re-arranging the order of statement elements changes performance radically (by an order of magnitude) - and of course it should not, but optimizers are written by mere mortals.

To Say Nothing...

...of how Links is supposed to know when it would be appropriate to build some indices, and on what. I suppose that, in the limit, it could periodically do an "EXPLAIN" and attempt to do some inference. That would be very interesting to see.

Wrong level problem

it could periodically do an "EXPLAIN" and attempt to do some inference

I think this underlines my qualms with the idea: in order to make the SPECIFICATION (language design) work, we are already pretty deep into IMPLEMENTATION issues.

This suggests the problems don't live at the language design level.

Yes and No...

...No, because a language that has the explicit goal of interoperating productively with existing "relational" databases has no choice but to finesse the interaction with SQL somehow.

Yes, because it's not clear that this "finessing" can't consist of having the usual kinds of support for extensible control-flow coupled with a good syntactic extension mechanism. Ehud recently reminded us of SchemeQL, which seems like an excellent example of the latter. I'm also reminded of Oleg's work on database interfacing in Scheme.

Permeable abstractions

From my experience, you will always need a way to get at the raw SQL generated and tweak that

It's sometimes necessary to get at the output of a C compiler and tweak that, too. Sometimes, but rarely. I think a relational query language compiler targeting SQL could get it right enough of the time to be useful. But I agree that it's important to be able to access the lower-level representation from time to time.

but rarely

Obviously my experiences are not "statistically significant". I have, however, been using Oracle, SQLServer and others for over 15 years. "Rarely" is not the term from my experience; "regularly" might be the term I would use.

Why, for example, do databases offer "hints" as a SQL facility? Because they know the issue is significant.

Why databases offer "hints"

Because they're committed to supporting a crude language in which trivia such as order of operands are significant, and are used to determine performance. A better approach would be to take the existing core engine and mate it to a more sophisticated language, one whose interpreter would do performance analysis on the fly to respond to the changing conditions in the core.

I've written a compiler targeting SQL, and getting the generated code to perform was indeed the hard part. I had serious DB experts advising me on how to improve the output, but the problem was generalizing that advice. The improvement added to make customer A's code perform later turned out to hurt customer B; and the information to distinguish the two cases just wasn't available at compile time. I had a design laid out to let runtime performance data feed back to compile time, but never got a chance to build it before the company imploded.

Costs/benefits model?

Why databases offer "hints"
IMO, because neither databases nor computer systems in general are aware of real-life costs. They might optimize against a cost model that is obvious to them - better response time, better throughput, better memory usage. Unless told by humans, they do not know whether the enterprise gets more bucks out of insert or select operations. This knowledge is not obtainable by any degree of introspection.

Thus it has to be specified externally.

One problem with the current state of affairs is that this information must be provided in the wrong form, by the wrong people, at the wrong time (SQL hints, by developers, at the time a statement is prepared). One cannot easily decouple this costs/benefits information from SQL statements in either space or time.

Perhaps Links will provide a separate life cycle for costs/benefits specifications.

Database Optimizers that Learn

Perhaps what we need are database query optimizers that learn?

Optimizers often produce query paths that are nonsensical, taking one or more orders of magnitude longer to do the query than is necessary. The current solution is either hints or having humans re-arrange the queries.

If we had query optimizers that learned, we could either (a rough sketch follows below):
  • Tell the optimizer to try again (offline - while tuning)
  • Have the optimizer use idle time to experiment with the last period's queries to find better paths, and remember them
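
A back-of-the-envelope sketch of the second option's bookkeeping, in Haskell. Everything here is invented for illustration - no real DBMS exposes such a hook that I know of - but the idea is just: record measured runtimes per (query, plan), and prefer the best plan observed so far over the cost model's guess.

    import qualified Data.Map as Map
    import Data.List (minimumBy)
    import Data.Ord (comparing)

    type QueryId = String
    type PlanId  = String

    -- Best observed runtime (in ms) per plan, per query.
    type History = Map.Map QueryId (Map.Map PlanId Double)

    -- Record a measured runtime, keeping the best time seen for each plan.
    record :: QueryId -> PlanId -> Double -> History -> History
    record q p ms = Map.insertWith (Map.unionWith min) q (Map.singleton p ms)

    -- Prefer the plan with the best observed time; fall back to the
    -- cost-based optimizer's pick when there is no history yet.
    choosePlan :: QueryId -> PlanId -> History -> PlanId
    choosePlan q fallback hist =
      case Map.lookup q hist of
        Just timings | not (Map.null timings)
          -> fst (minimumBy (comparing snd) (Map.toList timings))
        _ -> fallback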

And are not dogmatic

I've been having problems on the stored proc side where the compiler will figure out the plan based on the current inputs. Once compiled, the proc doesn't need to be recompiled until the conditions change or a recompile is needed.

Problem is, the parameters used in the first call may or may not be representative of the data as a whole. That means you can get a very small dataset which might perform a microsecond better doing one path, but that same path can take minutes to complete for the larger sets.

Haven't really figured out a good way around the problem other than dropping hints and trying to force a recompile on the worst-case datasets.

Nobody is teaching them

Database Optimizers that Learn
To learn, they must know about utility, and that's something they don't know.

Even if we forget about economic values: a DBMS can optimize locally - in the scope of the database - but not in the scope of the whole system, because nobody tells it about outcomes at the level of the whole system.

optimize locally

Even a "local" optimization is better than none?

A query that should take under a second with a good plan often will take several minutes. This one requires little knowledge of context.

Partially agree

Even a "local" optimization is better than none?
Yes, if it is a win-win. No, if it is a trade-off (as optimizations quite often are).
A query that should take under a second with a good plan often will take several minutes. This one requires little knowledge of context.
Now imagine that this query is executed at night, when nobody cares about its response time. Additionally, imagine that to ensure this query runs under a second we have to degrade the performance of some other query by 10%. If this other query happens to have less "elastic demand", doing this optimization might be a mistake.

Granted, "with a good plan" does not necessarily mean harming "utility" for other queries - it may be a win-win. But a classical DB dilemma - to index or not to index - is an example of a trade-off decision which cannot be meaningfully made locally.

State in Client

For security, you could encrypt the state held in the client?

The only secure way for the client to modify state

The only secure way for the client to modify the state then would be to have it pass to the server for modification. Hardly a performance win.

Proof-Carrying Execution

The client could always modify state locally and pass a trace of execution back to the server for verification. If the server detects that something is off, the client would have to roll back the changes made.

Security blanket

If the server detects that something is off, the client would have to roll back the changes made.

What if the change being made is to upgrade the security access of the client?

This does not require the state to be encrypted

This does not require the state to be stored in encrypted format on the client; in fact it requires the state to be available in the clear.

While less egregious, this is still egregious. What benefit does this scheme provide over having the master copy of the state on the server?

Scalability

Scalability. If the session state lives on the server, then you have to either store it in the DB or tie the client's session to a single Web server. If it lives in the client, then it comes back from the client on each request, and any Web server can handle the request--no application server needed. (Plus, the Back button doesn't break the app, since going Back unwinds the session state.)

I've built a calendar app like this. Simulated load showed one million users on something like $10,000 worth of hardware.
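
For what it's worth, there is a middle ground between "state on the server" and "trust the client": keep the state in the client for scalability, but have the server attach a keyed MAC so tampering is detected. A toy sketch in Haskell follows; the "MAC" here is a deliberately fake placeholder (a real system would use something like HMAC-SHA256 with a server-held key), and all the names are made up.

    -- NOT a real MAC; stands in for e.g. HMAC-SHA256 keyed by the server.
    macHex :: String -> String -> String
    macHex key msg = show (foldl (\h c -> h * 31 + fromEnum c) 0 (key ++ msg) :: Int)

    -- The server seals the state before sending it to the client...
    seal :: String -> String -> String
    seal key state = macHex key state ++ ":" ++ state

    -- ...and refuses any returned state whose tag no longer matches.
    open :: String -> String -> Maybe String
    open key sealed =
      case break (== ':') sealed of
        (tag, ':' : state) | tag == macHex key state -> Just state
        _                                            -> Nothing

Note what this does and doesn't buy: the client cannot forge state (so it cannot, say, upgrade its own security access, per the objection above), but it can still read the state and replay an old sealed copy, so sensitive fields still need encryption, and replay needs nonces or timestamps.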

Typing

I think the thing that excites me most about recent "practical" language design ideas, and the Links presentation, is in the area of rich type systems, specifically types for XML.

Integrating XML processing into the language, and ensuring compatibility between the program and the XML model via types, seems useful, doable, and cool.

XML is a distraction

Efforts to directly support XML types in a programming language have definitely produced cool results, many of which will find new applications. However, is the original goal misguided?

I feel that the reason XML is difficult to deal with in conventional languages is that it is a bad format. Regular expression-based typing, while "neat", results in confusing data structures. Regular expressions are great for matching data, but not so great for structuring that data. (That's why a parse tree is only used for matching. Once the pattern has been matched, the data is put into an AST, which is usually a more conventional data structure.)

The standard list/tuple/variant data types of Haskell and ML are much better than XML.
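
To illustrate with a tiny (hypothetical) example, compare a DTD content model with the data type you would actually want to program against - the regular-expression operators map onto ordinary list and option types, and a choice group is a plain variant:

    -- The DTD fragment
    --   <!ELEMENT contact (name, email*, phone?)>
    -- as an ordinary Haskell record:
    data Contact = Contact
      { name   :: String
      , emails :: [String]      -- email*
      , phone  :: Maybe String  -- phone?
      }

    -- A choice group (letter | fax | email) as a plain variant:
    data Channel = Letter String | Fax String | Email String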

Are we doing the wrong things?

I feel that programming language designers are most of the time found solving the wrong problems, and I feel that this is the reason why we have 98% of languages that no one cares about. Of the remaining 2%, only a handful are known in the commercial world.
(I work for one of the largest financial companies in the US, and many of the people there do not know anything exists beside perl/c/c++/java and sybase, even though we have some DSLs which of course no one uses. They solve the wrong problem anyway.)

I think that the goal of programming languages should be to enable programmers to build software faster and in a better way. You might argue that nifty language concepts allow people to write a program in a few lines of code, but that is not the solution, because people have other things to do than learn languages.

I think that new languages should concentrate on enabling software visualisation, visual debugging, and perhaps a new paradigm for thinking about programs. These features solve real problems of maintaining software systems. Although your code may be twice as large, if it is easy to visualise then that may be the way to go. There can be languages which are simple to learn yet very powerful, like Python and Ruby. Also, if you think that the field of visualisation is dead, then it means that all those methods were wrong.

About learning new languages

You argue that programmers have better things to do than learn new languages. I am currently involved in the development of applications similar to the Ajax-style client-server model. To do this, I had to learn
  • JavaScript as a real programming language (not just for simple web pages but for serious development, which is quite different).
  • The XUL XML dialect, for client-side user interface description (and an inspiration for Microsoft's upcoming Xaml).
  • The XBL XML dialect, for XML object-oriented (yet mostly untyped) programming.
  • Mozilla's XPCom component architecture (C++-based, dynamically yet manually typed, with half-automatic half-manual reference counting, without exceptions or return values).
  • And a few other things such as XmlHttpRequest's arcane usage, Mozilla's half-documented XPCom libraries, several specialized XML-RDF dialects...
In my opinion, this counts as learning 4 languages if not more.

In addition to this, in my experience, XBL and JavaScript programs typically require at least 3 times more lines than their OCaml counterparts, while XPCom-based programs require 4 to 5 times more lines. As you say, the number of lines is not really the issue. However, most of these lines are not just syntactic sugar but opportunities for segmentation faults and silent failures. Counting from all the errors I commit or see discussed, I would say that most of the errors should be handled by the language. Indeed, at least 95% of these errors would be caught by the compiler in OCaml or Haskell (for the components) and in XDuce or CDuce (for everything XML) [1], and at least 50% of them would actually be syntactically impossible, by language design [2].

Bottom line: I believe that there is a place for a good web client-server DSL. Perhaps, as with many of the good languages available, it will not be used. But I believe it would be worth learning and using. And it does not have to be more complex than Python or Ruby [3].

Heck, I would even volunteer to contribute to such a language, at the expense of some of the time I spend on my own projects.

  1. Note that I have not tried XDuce and CDuce and am speaking from mere readings of the manual.
  2. Statistics might be different in more algorithmically-challenging projects.
  3. I find that Python and Objective Caml are as easy to learn, with the difference that Objective Caml error messages are sometimes more obscure but also more useful in the long run. I have never learnt Ruby. Heck, I'm supposed to be a theoretician :)

Links

(a) Generate optimized SQL and XQuery directly from the source code -- you don't need to learn another language, or work out how to partition your program across the three tiers. This idea is stolen from Kleisli. You need to build it into the compiler, so Haskell, Ocaml, or PLT Scheme won't do.

This is an advantage, but I'm not sure that one big language is such a big sell over X smaller languages. In other words, yes it can be done (even though it's a pain).

(b) Generate Javascript to run in the browser directly from the source code -- you don't need to learn another language, or work out how to partition your program across the three tiers. We're hoping to provide a better way to write AJAX style programs. Again, you need compiler support -- vanilla Haskell, Ocaml, or PLT Scheme can't do this.

Javascript is a good temporary target, but I hope the architecture keeps the target interface abstract. I.e. if I write a Links application then I hope that it's as easy to create an interface for your cell phone, a native Windows desktop, and your X10 home appliance network as it is for your browser.

Where does a Links program exist in the three-tier model? Do you envision Links being used on the server side to define the logic of a distributed application and to compile user interfaces which are sent to the client?

Links could get tons of publicity and support if it integrates with Firefox and Thunderbird out of the box. Esp. if Links can outscript javascript. E.g. maybe handling DOM events with functional reactive programming. Or ensuring the password I enter on the LtU login does not flow to some malicious site. (This isn't just hypothetical; ISearch, a Windows spyware program, does periodically hijack Firefox on my home machine.)

Cell phones might be a good "choke-point". Everyone has them, and they can be programmed (but not easily). Companies haven't yet figured out how to exploit them, and best of all you've got a distributed and unreliable network with which you can showcase the difficulties of conventional programming and the ease of Links development.

Links meets Mozilla

Javascript is a good temporary target, but I hope the architecture keeps the target interface abstract. I.e. if I write a Links application then I hope that it's as easy to create an interface for your cell phone, a native Windows desktop, and your X10 home appliance network as it is for your browser.

So do I. In addition, there is the upcoming war between Html-based, Mozilla's Xul-based and Microsoft's Xaml-based user interfaces for web/desktop applications.

Links could get tons of publicity and support if it integrates with Firefox and Thunderbird out of the box.

I concur.

one possible right thing

The recent proliferation of languages has been remarkable...to my mind (but possibly not for others) there's really a lot of interesting stuff going on.

Things get used when there are good reasons to use them, and they play nice with what you already have. In the case of wider-world programming, there's a crapload of stuff we already have. Who wants to write an FTP handler? Not me. I'll just download one written in Java, and away I go.

On the Java VM I can pick from dozens of languages, and they all target the common platform. I think that's great, and I probably won't find too many people who disagree.

When I look at compilers, I see just about every compiler group rewriting the same stuff, over and over. Tokenize, parse, assemble AST, build symbol tables, import symbols and type info from the outside world, construct name tables, perform a series of transformation passes over the AST, translate to output form, emit code.

So why isn't this stuff generalized into a framework? I guess everybody wants to prove their language by bootstrapping it, or something like that ;)

I am thinking about a slow, slow compiler that does not do much, but is interesting anyway. Once I get tokenizing out of the way, I can build my AST in something somewhat standardized like, say, an XML DOM tree. Yuck, you say, and I agree, but it's good enough for attributed trees and the whole point is to be able to talk about trees in a standardized way. It's also good to be able to transform them. In the slow slow compiler you could build a transformation phase with XSLT. You could pattern match with Scala, or write a spiffy Java program that did everything the hard way. If that's what you wanted to do.

You could also transform it all into Scheme and play with it as S expressions.

So we have trees, with characteristics and shapes. We want to be able to get from one form to another. If I have a super spiffy idea for direct-random-access-pattern-match-compilation, wouldn't it be great to be able to just code that phase in a standard sort of way, specifying the characteristics and shape of the tree needed as input?

It starts to sound like a job for a reasoning engine. My goal is to create executable bytecodes. Those need to be flat. My flattener can take trees, but they can't have inner classes. My class definitions need to have type erasure performed on them. Type erasure needs type analysis.
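
As a thought experiment, here is a minimal Haskell sketch of that idea: phases declare the tree properties they need and establish, and a naive backward-chaining planner assembles a pipeline. All names and properties are invented, and a real system would need cycle detection and much smarter conflict handling.

    type Prop = String  -- e.g. "flat", "no-inner-classes", "types-erased"

    data Phase t = Phase
      { phaseName :: String
      , needs     :: [Prop]  -- tree shape required of the input
      , gives     :: [Prop]  -- tree shape guaranteed of the output
      , run       :: t -> t
      }

    -- Backward-chain from the goal property to phases that establish it.
    plan :: [Phase t] -> [Prop] -> Prop -> Maybe [Phase t]
    plan phases have goal
      | goal `elem` have = Just []
      | otherwise =
          firstJust [ (++ [p]) <$> plansFor (needs p)
                    | p <- phases, goal `elem` gives p ]
      where
        plansFor reqs  = concat <$> mapM (plan phases have) reqs
        firstJust opts = case [x | Just x <- opts] of
                           (x:_) -> Just x
                           _     -> Nothing

    -- Running the planned phases in order.
    applyPlan :: [Phase t] -> t -> t
    applyPlan ps t0 = foldl (\t p -> run p t) t0 ps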

Any such system needs fixtures upon which to operate; familiar patterns to gain traction. Finding simplified common representations for different type systems, for functions, for class/method/procedure/rule/pattern/coroutine, seems pretty important to me.

Bits and pieces of this exist in places like gcc, but who the hell wants to start with that, and have to build phases in C? Yuck. If I can losslessly transform an Abstract-AST out to a representation in a language of my choice, I can be having super quality happy time writing my lambda lifter in Perl. Or smashing my fingers with hammers, which might be more fun.

These thoughts came about when I was comparing the java 5 compiler with Scala; the two of them are more alike than they are different. There's a pile of infrastructure that's the same, but different. Scala's AST and symbol handling is a little twistier, but so what. Scala does get to do cool pattern matching on its own AST (or in Pico before that). That seems pretty desirable to me, so a nice framework for doing this stuff would have a pattern matching capability. And bytecode output for your favorite VM. And... :)

We need a VM for compilers. Or maybe bootstrapping the universe is just fun and worth doing over and over again ;)

Bootstrapping the universe

Some of the problems you address--scanning and parsing--are pretty much done with standardized tools. Others, of course, aren't--in part because there isn't agreement on how ASTs, for example, should be structured, or how to deal with semantic analysis (attribute grammars?).

Machismo may be part of it, too. There is a longstanding prejudice (justified or not) that a (domain-independent) language isn't worth talking about unless one can write a compiler for said language in itself; guess what is often written first? Obviously, you can't use existing tools if you're bootstrapping a new language.

That said, toolkits like you mention do exist. If anything, there are too many of them...depending on what host language you choose to use.

yeah but

I think host language and platform matter quite a bit, and a good deal of effort goes into building that stuff.

Any real-world compiler needs to be able to do things like crack apart .class files, or introspect a .net assembly, or pull apart a .lib file. At the end of the day you need to be able to flatten, linearize, emit bytecodes, assembly, whatever.

The middle-most parts of a compiler can get specific to the language and have a fair amount of complexity. Those outer parts, though, are pretty generic. Just about every compilation-based language for the Java VM needs to implement classpath reading, cracking, lazy instantiation, and so forth...

Choosing a host language is really part of the problem -- phases are necessarily written in one or more languages, but a compiler as a whole need not be homogeneous.

Imagine a "compiler" that did a tokenize/parse, then wrote the results down into an XML (or whatever structured representation) file. Another program then performs the next phase and writes its results...just like in the very most ancient days of compilers. If we have a well-defined basis for representation, phase logic can come from many places.

Scanning and parsing just isn't as hard as it used to be; plenty to help us there.

Ultimately this is about making software assembly easy, or even automatic. There's not much difference between backwards-chaining compiler/transform assembly and performing the same task in other domains (pulling together software transformations written in multiple languages and models).

The truth is out there.

The MetaEnvironment includes many tools and interfaces with even more:

  • Stratego - a tree transformation language
  • ASF - a term rewriting language with interpreter and compiler
  • SDF2 - a parser definition language
  • ATerm - a general and efficient abstract syntax tree representation

MetaEnvironment also includes ToolBus for communication among components, TIDE as a generic debugger, and more.

There's CTK, the Compiler Toolkit, and CTKlight for lightweight applications. (CTK's fast lazy lexers are particularly nifty.)

GCC-XML dumps intermediate structures to XML. David Himmelstrup is using that in Hacanon for example.

There are lots more language/compiler/interpreter tools and frameworks available; these are just the ones I can think of off the top of my head.


--Shae Erisson - ScannedInAvian.com

your google is better than my google

so I will start looking at what you've suggested! thanks for the pointers.

build woes

(Now if anyone can get the MetaEnvironment to build in Windows, please let me know how you did it. I've been wanting to experiment with their software for a long time now, and have repeatedly been thwarted in my attempts to build for Cygwin.)

ouch

I've gotten as far as mostly getting graphviz to build -- there's a few extra packages you have to get installed into cygwin (curl, gd, and I can't remember the other ones).

Sure wish there was an ebuild for Gentoo. ;)

In any case, thanks to Shae for the pointers. My head is now on the verge of detonation from reading all these papers.

Debian packages for MetaEnvironment

There are official Debian packages for most of the MetaEnvironment, you may want to try those if you have the option.

--Shae Erisson - ScannedInAvian.com

Stratego/XT in Gentoo

I'm the maintainer of the Stratego/XT packages in Gentoo. We have the entire XT environment available, and it is usually fairly up to date.

We don't have the MetaEnvironment, but if people are persuasive, I can add it. Just open a bug at bugs.gentoo.org and assign it to me, karltk@gentoo.org, then I'll see what I can do for you.

Embedding of languages

We are using Stratego/XT to embed XPath in Java using java-front. So there's no need to create a new language (since all the DSLs are already in place). Stratego has one bad thing: its error reporting. That aside, it's pretty good.

On SDF, ASF, and Stratego

I'm one of the developers of Stratego, so maybe I can help a bit. First, Stratego is not part of the MetaEnvironment. Let me explain the structure of all this work a bit.

First, there is SDF. SDF is a modular language for syntax definition. SDF simply rocks and kicks the ass of any language for syntax definition I know about (sorry to be informal and subjective here; I reserve formalities for our publications). SDF integrates lexical and context-free syntax and supports arbitrary context-free grammars, as opposed to subsets such as LL, LR or LALR. All these subsets cannot be modular, because they are not closed under union. Our Concrete Syntax for Objects paper at last year's OOPSLA explains why modular syntax definition and scannerless parsing can be very important for some projects (e.g. embedded DSLs). SDF is easy to install. At syntax-definition.org we provide binary distributions of all the tools that you need to use SDF, for most platforms (RPM, Cygwin, Mac OS X; sorry, no native Windows yet).

Second, there is ASF+SDF, which is a first-order term rewriting language that employs SDF to allow rewrite rules to be written in concrete syntax. The ASF+SDF MetaEnvironment could be described as an IDE for ASF+SDF. I'm not a maintainer of ASF+SDF, so I don't know much about the available distributions.

Third, there is Stratego/XT. Stratego is a program transformation language and XT is a set of tools for implementing complete program transformation systems. Stratego/XT also uses SDF for syntax definitions, in particular for providing support for using the concrete syntax of the object language in your meta programs. Stratego/XT is a plain set of command-line tools (an SDK, say): compiler, interpreter, etc. There is no full-featured IDE available, just some syntax highlighting stuff for Emacs and Nedit. Stratego/XT distributions are available in source and binary form for RPM distributions, Cygwin and Mac OS X. Please don't use the Debian distributions: these have not been created by us and don't seem to be maintained. The available version is completely out of date.

I hope this makes things a bit clearer, and I hope you enjoy these tools if you start using them. Stratego and ASF+SDF are rather domain specific, but SDF and Scannerless Generalized-LR parsing can be applied in a lot of different projects, particularly if your parsing needs are getting too complex or you need to prototype some language.

Sorry if that sounds like a commercial break, but shapr referred to our tools and I want to make it as clear as possible to new users what they should download :).

commercials

That's perfectly alright, Martin. I am sure many here are interested in your work. I, for one, am always glad to hear more about it.

Thanks for the clarifications

Thanks for the clarifications -- I am poring over SDF, and your note indicates I should be looking heavily at the Stratego/XT toolset as well.

Interesting

You may find this http://www9.org/w9cdrom/342/342.html of interest.

SmartTools

I remember seeing a presentation on SmartTools, which provides much of what you are asking for. It allows the description of a new DSL, and generates both the IDE and the compiler for it. The main target is an AST on which visitors can apply the transformations of your choice.

Parse tree munching with XSLT

Once I get tokenizing out of the way, I can build my AST in something somewhat standardized like, say, an XML DOM tree. Yuck, you say, and I agree, but it's good enough for attributed trees and the whole point is to be able to talk about trees in a standardized way. It's also good to be able to transform them. In the slow slow compiler you could build a transformation phase with XSLT.
I suggested this route to students in last year's compiler lab. It worked quite well actually:
source code -SableCC parser->
parse tree  -SableCC->
XML dump of parse tree -XSLT->
AST in XML representation -DTD->
Java Class representation of AST & validity check(!)

Manual steps: writing the grammar, the tree transformation rules (XSLT), and the DTD; everything else was generated.

Using XSLT instead of Java for tree transformations was a big win (declarative, faster to write, more concise, easily changed later on in the project). Also, the DTD is a spec for the AST, allowing other tools to rely on its structure.

Of course, some lispy language would've done just as well, and probably in a more concise way than XML/XSLT, but there you go...

LtU and Submarine Soundness

Anton and Ehud, were there web development gripes you had setting up the new LtU? Would integrated XML/SQL have made your lives significantly easier setting up and maintaining this site?

Philip, by what criteria would you judge the success of Links? The ease of setting up a 3-tier site (like LtU or Slashdot)? The innovation of new applications? Acceptance in industry?

As far as language acceptance goes, Perl might be the best language to follow: a huge selection of libraries which are easy to install, and a standardized build process. Also crucially important is the documentation. For Links to be pervasive, there ought to be something comparable to the Perl Cookbook online. I'm increasingly convinced that the pragmatic programmer doesn't give a damn about the theory of a language. Soundness and semantics are not of interest to people not developing the compiler and language tools, so I wouldn't even mention them in the marketing.

Some simple applications should be written just for education. I suggest a Wiki and an IRC client as every language under the sun has implementations to which the novice can compare.

A challenge for everyone: What are the web-services you would like to write? Pie in the sky is good. I can think of:

  • Annotation of arbitrary web pages - take the idea of the Annotated Times and apply it more broadly. E.g. the server directs the client to download a page from the Times, and annotations from the bloggers of your choice. This is probably pretty easy; what would be more interesting?

some applications

Some example applications that I'd like to see:

  • A wiki engine. Better yet, several experimental wiki engines with different design trade-offs. If the language is powerful enough that I can easily whip off different prototype frameworks for some particular web application, then I might say, "that's something I can't do with any other technology."
  • A blogging engine. Same deal.
  • Forget meta-circular compilers, I want to see a meta-circular LinksDoc tool for generating API documentation web pages!
  • A homework submission server.
  • CiteULike.

Re: LtU

Anton and Ehud, were there web development gripes you had setting up the new LtU? Would integrated XML/SQL have made your lives significantly easier setting up and maintaining this site?

It wouldn't have helped directly, because we didn't do much of that work — we used an off-the-shelf tool, Drupal. Drupal is a pretty well-designed system — its founder, Dries Buytaert, is a CS PhD student. However, it's implemented in PHP, and I'd say the biggest single problem with it is that customization tends to require directly modifying the relevant code, i.e. beyond Drupal's web admin interface, it's not nearly as heavily driven by templates and config files as some such systems are. That was apparently a conscious design choice — there's been some debate in the PHP community about which approach is preferable. One of the arguments against the more data-driven approach boils down to a pragmatic attitude which might be described as premature abstraction being evil, i.e. don't try to anticipate too much how users might want to configure the system, but make the code easy to modify.

If there were a more powerful language which was popular enough to have supported the development of a highly capable tool like Drupal, we might have benefited from improved abstraction capabilities. OTOH, I've noticed that one barely has to know PHP to modify a program which uses it — superficially at least, it's perhaps one of the least surprising languages I've come across (assuming one has a certain familiarity with the usual mainstream languages). How much of PHP's success has been driven by this effect? A more "advanced" language, especially one with a modern static type system, might have difficulty replicating that quality. For development of systems in which a major part of their value resides in the collective effort that's gone into filling in details, that ease of use is likely to be a major success factor.

I'm not sure that has to be a problem for something like Links, though. I wouldn't have thought Links would really want to replace PHP — there's more interesting fruit a little higher up the tree, closer to the region in which Java and C# play. (Cue inter-language flamewar — stereotypes used for communication purposes only.)

Growing procedural languages

Is there a reason that these features cannot be added to languages like Python, Ruby or Perl? In terms of runtime and compiler development, how costly would it be to implement similar features in more popular languages?

Donning REST Roundhead helmet...

The client transfers state to and from the server via hypermedia representations. The state of server-side resources is owned and managed by the server. But it is up to the client to keep track of the state of its current interaction with the server (its current location in the state diagram that would describe that interaction).

In the case of values "remembered" between one browser page-load and the next, it's a question of the browser POST-ing those values to update a server-side resource, and getting them back again as part of the server's response to that POST. This isn't so much "client-side state", as state threaded through the client's conversation with the server by each party continually echoing it back to the other. (Note that this is an approach to state-persistence peculiar to browsers: clients that are applications can remember things perfectly well for themselves, thank you).

In this scenario, either party (or a devious intermediary) is at liberty to modify the contents of the "echoed" state. The server must therefore never accept anything that comes from the client as a reliable echo of something authoritative that the server said previously, unless there is some additional means of verification.

Continuations

I take it that what's meant here by keeping state in the client is that continuations are serialized and then echoed back and forth (with the server issuing a new continuation-representation each time the computation advances). See Anton von Straaten's Continuations continued: the REST of the computation for a discussion of some of the issues relating to this approach.

Motivations

Web programming definitely presents some rather difficult issues, and it is good that someone is looking at the tasks involved from a holistic perspective.

In several ways, though, I don't believe Links is the answer:

  1. Javascript programming is hard not because of the language, but because of the DOM and event models. The purpose of client-side javascript is mainly to update/redraw the UI elements. With the fragmentation of browser marketshare (MS, Firefox, Safari, mobile devices), the problem is getting harder, not easier. At this level, it is easier for a developer to debug Javascript directly, rather than debug generated code, which might be run on another interpreter on top of the JS interpreter. The number of talented people who build open-source cross-browser libraries can be counted on the fingers of my hands, and the last thing a language designer wants is to make it even harder to reuse one of these libraries.
  2. On the RDBMS side, query languages are easy, but update languages are hard, if one has to consider concurrency, deadlocks (updating tables in the same order), transactions, locking, autonumbering. Most databases support these to varying degrees. There is no substitute for working directly with SQL. Object-relational mapping and relational-object mapping have had some limited success in this area.
  3. Scalable, stateful clients have already been explored in detail by ASP.NET's page state. It is in its second iteration now with 2.0, offers a plethora of options (encryption, binary mode, centralized on one state server), and is not in itself interesting to study anymore.
  4. I haven't seen the specifics of ASP.NET 2.0's AJAX capability, but the existing eventing mechanism is already quite nice (incidentally, Java Server Faces is more or less the same), although the run-time is fairly useless as far as making debugging productive; there is no REPL mode at all. I recommend studying it closely before reinventing another one.

If we look at successful web frameworks, most of them succeed by partitioning the framework into smaller DSLs which allow work to be effectively partitioned and specialized: templating languages for web designers, dynamic languages for developers, DBAs who tune the SQL statements using various hinting tricks. In ASP.NET, a developer can purchase a myriad of third-party UI components which draw slick-looking menus or other widgets with HTML. This is why we probably shouldn't be looking for the one true language, but rather partition the problem domain so that each person can be most productive in their environment.

I believe the kind of programming problems faced today are not of the coding kind, but of the understand-and-modify kind. Most of my daily work involves figuring out how an existing piece of code works, and modifying it without breaking other pieces. Better runtimes would be more useful - for instance, one that would enable me to trace how a variable arrived at a certain value, and to retry with minor code modifications without breaking my rhythm. To this end, I think Smalltalk got it right.

The hardest and hairiest part of web programming now is dealing with DHTML, a bar that has recently been raised by Google Maps and Google Suggest, and the other AJAX programs that have come online. I'd be willing to invest time to learn a new language if:

  • It means I no longer have to write any DHTML and sniff for browser implementations
  • It will interface well with other languages. For instance, we have about 10 man-years of code written to run on Zope, and we can't afford to lose this.

Web Programming Language

I agree there is room for a new language in the Web area.
Current solutions are:

  • PHP / Perl: dynamically typed programming languages. It's possible to build big applications with them but they tend to break easily. More important: they're very slow, and cost a lot of server CPU
  • Java: using JSP or Java server applications has some good points: you learn how to think of your website in terms of an application, it helps to structure your program, and gives you some typing. However the Java type system and VM are far from flexible, so you need to go through a lot of generated high-level wrappers in order to hide all the SQL/XML/... work done behind them. The additional typing then turns into a pain, and all these technologies together are too heavyweight for small to medium websites
  • Ruby/Rails, Python+Zope: didn't investigate these ones so much; they are also dynamic and not famous for performance

That's why I ended up developing my own platform. I already had a fast and lightweight virtual machine, a compiler for a small dynamically typed scripting language (MotionScript), and a high-level functional+OO language with type inference (MotionTypes) which we use to develop Flash contents. All I had to do was to write a MotionTypes -> MotionScript code generator, put the VirtualMachine into a "mod" for Apache, and add bindings for MySQL, some system commands, and some access to the Apache API. And it's now running very well.




I think that the following approaches will be successful:

  • focus on "website-as-application", not as a bunch of scripts : build a single binary for deployment
  • have a two-step compilation approach: from one high-level language to a low-level one, then to bytecode/JIT/native; this will ease 3rd-party development of additional tools/DSLs in your web framework
  • don't try to solve all the problems at the same time. First focus on some specific point (for me it was typing and performance) and then add features after it has proven usable. My next target is database integration, but I should keep low-level SQL bindings since some users might not want to do the "Big Jump" and learn all these new technologies at one time.

I somewhat agree with Xavier Leroy's slides. In general, language designers are theorists and should focus on languages where the theory is important. A web programming language is more about standards, APIs, and technology. It might be more successfully handled by engineering people (no offense to theory people; most of the time they make very good engineers too).




PS: it's a bit OT, but I've recently been working on a compiler framework such as proposed in this thread; interested people can contact me at ncannasse at motion-twin dot com.

You seem to have left out the scheme/lisp/smalltalk solutions

You seem to have left out the scheme/lisp/smalltalk solutions.

multitier solutions

Given this thread I wondered about the following: are you answering the right questions vs. answering the questions right? And: are you solving the right problem vs. solving the problem right?

Dunno, probably; anything making multitier development easier seems worth investigating. But, given recent threads, would a framework for multi-DSL development not be a better answer? (I am thinking of syntactical solutions, not VM models.)

And, if multi-language development is a problem: why then didn't the javascript server/client combination become popular? Javascript - based on a Lisp tradition - is a nice functional language. Maybe it's just because they forgot to implement ADTs and pattern matching?

Can it be that it would be equally worthwhile just to showcase how powerful javascript actually is if some syntactic sugar would be added?

Just my (late) $0.02 opinion.

For proprietary frameworks?

why then didn't the javascript server/client combination become popular?
I witnessed a system based on this combination die before going to production. JS was lacking infrastructure (in those days; not sure about today), so the developers were forced to implement the infrastructure in C++ (I mean networking, clustering, metadata, database access, multitasking, etc.).

They still had the benefit of the same language for business logic on both client and server, and that was indeed good. OTOH, the boundary between business logic and infrastructure became socially impenetrable - BTW, this is no different from EJBs, where the developer of a bean cannot go and hack the container. For a proprietary framework, though, this property was not economically attractive (the whole point of keeping a framework proprietary is flexibility).

So this framework was replaced by a Java-based one.

Some people work very hard / but still they never get it right

Well, I'm starting to see where this web development stuff is going. It's funny, because I've known about client-side scripting using hidden IFrames since my internship at Microsoft in 1998 (DHTML had just come out in IE4 and we were working on a product that made heavy use of this trick), but I didn't have the foresight to realize that this would eventually become a standard platform. At the time it just seemed like a massive hack to fake a crippled UI inside an application that was never meant for UIs.

Given enough market demand, though, people will find a way to turn the most hostile environment into a home. Even one as hostile as HTTP and HTML. Applications like this are convincing me. (As are all the amazing things coming out of Google these days.)

Incidentally, I also consider it a big win for high-level languages that Java completely lost to Javascript as a client-side web application development platform.

v. smart!

Philip, with the rise of Ajax, it will become increasingly obvious that you are on the right track.

Ajax

But isn't Links about XML, SQL and other server-side stuff, while Ajax is about heavily Javascripted client-side interfaces (plus GUI, so lots of CSS and JS)?

Maybe the killer technology would be a Common Lisp in JavaScript and a declarative layout language that compiles to CSS :)

(yes, I've tried size-independent layout with CSS; it just doesn't work; sometimes 10px might be understood, but 1% isn't and stupid stuff like that)

Links meets Ajax

Ajax is mostly client-side, but it still needs to make server requests, and deal with the local DOM, etc., so if Links provides an excellent framework for doing Ajax software development (i.e., JavaScript, CSS, XMLHttpRequest, etc.), that should be a definite win.