LtU Forum

Everybody Needs a Syntax Extension Sometimes

Sebastian Erdweg, Felix Rieger, Tillmann Rendel, and Klaus Ostermann, to appear in Proc. of Haskell Symposium, 2012.

Programmers need convenient syntax to write elegant and concise programs. Consequently, the Haskell standard provides syntactic sugar for some scenarios (e.g., do notation for monadic code), authors of Haskell compilers provide syntactic sugar for more scenarios (e.g., arrow notation in GHC), and some Haskell programmers implement preprocessors for their individual needs (e.g., idiom brackets in SHE). But manually written preprocessors cannot scale: They are expensive, error-prone, and not composable. Most researchers and programmers therefore refrain from using the syntactic notations they need in actual Haskell programs, but only use them in documentation or papers. We present a syntactically extensible version of Haskell, SugarHaskell, that empowers ordinary programmers to implement and use custom syntactic sugar.

Building on our previous work on syntactic extensibility for Java, SugarHaskell integrates syntactic extensions as sugar libraries into Haskell’s module system. Syntax extensions in SugarHaskell can declare arbitrary context-free and layout-sensitive syntax. SugarHaskell modules are compiled into Haskell modules and further processed by a Haskell compiler. We provide an Eclipse-based IDE for SugarHaskell that is extensible, too, and automatically provides syntax coloring for all syntax extensions imported into a module.

The paper describes an approach to extensible Haskell syntax. The whole concept looks like it could be integrated into GHC through its language extension mechanism, which would be easier to use than an external preprocessor (though SugarHaskell itself is written in Java). In comparison with Template Haskell, the authors emphasize that their extensions are modular and composable, which cannot be said of TH. The sources are on GitHub.

Social Influences on Language Adoption

This paper quantitatively analyzes why some programming languages succeed and others fail. We analyze several large datasets, including over 200,000 SourceForge projects and multiple surveys of 1,000-13,000 programmers. We observe trends in language popularity and adoption. Popularity follows an exponential curve: the tail accounts for only insignificant development effort and the top few languages succeed across a wide range of domains. Examining adoption, we find that social factors usually outweigh technical ones. In fact, the larger the organization, the more important social factors become. Likewise, developers are willing to adopt new languages, but are heavily shaped by their education. Developers prioritize expressivity over correctness, and perceive static types to be more helpful for the latter than the former. Taken together, our results help explain the process by which languages become adopted or not.

Paper by Leo Meyerovich and Ariel Rabkin, ostensibly a part of their Socio-PLT effort.

Should continuation capture always be considered to be stack unwinding?

Take the case of using continuations to implement coroutines: when a coroutine yields, do we really want to run unwind handlers (i.e. dynamic-wind)? After all, we are still "in" the coroutine, only its execution has been suspended, and is expected to continue at the point of suspension. It would seem strange to run the unwind handlers in this case, no?

Would it make sense to have two forms:

  • (dynamic-wind pre-thunk thunk post-thunk) as in Scheme: pre and post thunks are always performed, whether control enters/exits the thunk normally, or through an exception, or through continuation capture.
  • (unwind-protect thunk post-thunk) as in Common Lisp: post thunk is not performed if control exits thunk through continuation capture, only if it exits normally or through an exception.
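
For comparison, here is a minimal sketch in Python. Generators are not first-class continuations, so this is only an analogy, and the names `coroutine` and `log` are made up for illustration. A `try`/`finally` inside a generator behaves much like the second form: suspending at a `yield` does not run the cleanup, which happens only when the body finishes, raises, or is explicitly abandoned.

```python
log = []

def coroutine():
    """Generator standing in for a coroutine that has an unwind handler."""
    try:
        log.append("enter")
        yield 1                  # suspend: control leaves, but nothing is unwound
        log.append("resumed")
        yield 2
    finally:
        log.append("cleanup")    # runs on normal exit, on an exception, or on close()

co = coroutine()
first = next(co)                     # runs up to the first yield
log_after_first_yield = list(log)    # the handler has not run yet
second = next(co)                    # resumes at the point of suspension
log_after_second_yield = list(log)
co.close()                           # abandon the coroutine: GeneratorExit is raised at the yield
log_after_close = list(log)          # only now has the handler run
```

The handler fires once, when the coroutine is given up for good, which is the behaviour the question seems to be asking for when a coroutine merely yields.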

Alien worlds, values, and you can't touch this

Conventional programs operate on values from one coherent "world". For example, many programs execute on a von Neumann machine (i.e., the CPU world) where values are stored in registers or in uniform access memory, while GPU-based programs operate in a very different world with severe computational and memory restrictions. Other examples of these worlds include distributed and parallel computing environments (futures, map reduce clusters), persistent data stores (relational and graph databases), worlds where time is abstracted (FRP), worlds where space is abstracted (GPU, various databases, array programming languages), and so on.

Typically, our bridge between different worlds is a very coarse API where code on the other side is often written in a different language, but DSLs/libraries like Bling can allow us to manipulate from the CPU world resources and values from other worlds through various wrapping and indirection tricks. So in Bling we can manipulate a timeless FRP-world where signal-like values are composed and routed between UI components, or we can write some C# code that operates on GPU-based values to express vertex layout and pixel colors.

Such techniques are hardly transparent: we must firewall values between different worlds, as they either do not mix or how they mix is very tricky and possibly very expensive. An int on the GPU is very different from an int on the CPU even if they both support "+": given a + b, where a is a GPU value and b is a CPU value, the result is a GPU value, and we must add extra logic to route b onto the GPU at runtime (as a global shader parameter). We could even bring the GPU int back to the CPU somehow, but only at extreme cost that involves transferring GPU memory to CPU memory; it's not something that should be allowed to be done lightly.

Haskell seems to have great support for world bridging, possibly because its own world is so idealized that bridging is easier as well as often necessary. My question, and idea, is whether it would make sense to add specific support in a programming language for world bridging; i.e., should a language have specific support for alien values that do not come from its own native world? Alien values will often share data types with the host world (e.g., a CPU int vs. an alien int), and many operations on those data types will often have some kind of reasonable interpretation (e.g., gpu_a + 1 yields a GPU value that is formed from an add computation). What sort of abstractions would be necessary to allow as many safe compositions as possible, prevent unsafe compositions, and make expensive operations very obvious?
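
To make the question concrete, here is a rough sketch of what an alien int could look like as a plain library type. Everything in it is hypothetical: the `Alien` class, the `p0`-style parameter names, and the string expression format are made up for illustration and are not how Bling does it. The points it tries to show are that host + alien yields an alien value, that the host operand is recorded as a parameter to ship across, and that the implicit conversion back to the host world is refused.

```python
import itertools
from dataclasses import dataclass

_fresh = itertools.count()   # source of fresh parameter names (illustrative)

@dataclass(frozen=True)
class Alien:
    """An int that lives in another world; on the host it is only an expression."""
    expr: str
    params: tuple = ()       # (name, host value) pairs that must be routed across

    def __add__(self, other):
        if isinstance(other, Alien):
            return Alien(f"({self.expr} + {other.expr})", self.params + other.params)
        if isinstance(other, int) and not isinstance(other, bool):
            # Host int: route it into the alien world as a named parameter.
            name = f"p{next(_fresh)}"
            return Alien(f"({self.expr} + {name})", self.params + ((name, other),))
        return NotImplemented

    def __radd__(self, other):
        # host + alien is treated like alien + host (addition commutes here)
        return self + other

    def __int__(self):
        # No silent trip back to the host world; that must be an explicit call.
        raise TypeError("alien value: use an explicit, expensive read-back")

gpu_a = Alien("gpu_a")
result = gpu_a + 1           # still an alien value; 1 becomes parameter p0
```

A real design would presumably track the world in the type (so GPU and database values cannot be mixed by accident) and give the read-back a conspicuous name; this sketch only shows the firewall idea.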

Further work on expansion-passing style?

Expansion-passing style: A general macro mechanism [pdf] was published in 1988. It proposes a system that increases macros' flexibility by allowing them control over the expansion of their subforms.

Since EPS came out at the same time as lots of new work was being done on hygienic macro systems, it seems not to have gotten the attention it deserved. At least I can't find any work based on it. Is anyone aware of subsequent work that I missed? What about any work that, while not overtly related, might address the issues at the end of the paper?
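
For readers who have not seen the idea, here is a small sketch of it in Python, loosely modeled on my reading of the paper and not taken from it. Forms are nested lists, symbols are strings, and the names `EXPANDERS`, `initial_expander`, `expand_quote` and `expand_unless` are my own. Each macro receives the form together with the expander `e` to use on subforms, so it can decide whether and how its subforms get expanded.

```python
EXPANDERS = {}   # macro name -> function (form, e) -> expanded form

def initial_expander(form, e):
    """Default expander: dispatch on the head, otherwise expand every subform with e."""
    if not isinstance(form, list) or not form:
        return form                        # atoms expand to themselves
    head = form[0]
    if isinstance(head, str) and head in EXPANDERS:
        return EXPANDERS[head](form, e)    # the macro controls what happens next
    return [e(sub, e) for sub in form]

def expand(form):
    return initial_expander(form, initial_expander)

def expand_quote(form, e):
    # quote never calls e, so its subform is left completely unexpanded
    return form

def expand_unless(form, e):
    # rewrite, then hand the result back to the expander we were given
    _, test, body = form
    return e(["if", ["not", test], body], e)

EXPANDERS["quote"] = expand_quote
EXPANDERS["unless"] = expand_unless

expanded = expand(["unless", "done", ["quote", ["unless", "a", "b"]]])
```

The outer `unless` is rewritten to an `if`, while the quoted inner `unless` is left alone because `quote` never passes its subform to `e`.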

Crafting a toy language for learning purposes

Hi lads,

I'm considering designing a basic, simple toy language that's able to compile itself, mainly for learning/fun purposes. I'll be targeting something that can be done over weekends as a hobby. One thing I'm particularly interested in, in order to bootstrap, is to have clang as a backend and somehow use bindings in the compiler to generate code. Any pointers to tutorials, blogs or even GitHub source code of similar projects (as small as possible for me to actually understand) would be really nice.

Thanks

Pythonect 0.4.0 Released

Hi All,

I am pleased to announce the release of Pythonect 0.4.0, available from https://github.com/downloads/ikotler/pythonect/Pythonect-0.4.0.tar.gz

Release notes can be found at: https://github.com/ikotler/pythonect/wiki/What%27s-New-In-Pythonect-0.4

This version fixes several bugs and adds some important features.

Many thanks to everyone who contributed bug reports and feedback that went into this release!

Regards,
Itzik Kotler

Semantics of the dodo language

This page introduces the semantics of the dodo language, as I imagine them.
http://dodo.sourceforge.net/semantics.html

It is not complete and I will continue to add to it. But this should be enough to start a discussion.

Small quote:

An object is a structure variable which has a hidden attribute Class which refers to its interface. This interface is used to access attributes and call member functions of the variable, without a need for the caller to know the exact type of the variable.
Example:

object1.draw(paper)

A class is a type associated with an object instance.

Presentation at the Berlin Compiler Meetup on programming with algebra

On 7 August I will speak at the Berlin Compiler Meetup on SubScript, my extension to the Scala language, based on the Algebra of Communicating Processes (ACP). According to Google Streetview the location is near (above?) trattoria La Scala (!) at Rosenthaler Strasse.

You may view ACP as an extension to boolean algebra, but also as a BNF-like formalism with support for parallelism, next to sequential and choice composition. Combining it with a "normal" language such as Java or Scala eases event-driven and concurrent programming, as used in GUI controllers, text parsing and discrete event simulations.

See the SubScript project site at Google Code.
If you are a student looking for a project on a subject such as concurrent programming, you may find one here.

Are nested SQL statements monads?

Today I suddenly had an epiphany: nested SQL statements are monads, in that they enforce an order of execution.

Could someone tell me I'm correct? :-)
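
One way to make the analogy concrete, offered as a sketch and not as a settled answer: in the list monad, bind is a flat-map, and a nested query that runs an inner query once per outer row has the same shape. The tables and the names `bind` and `unit` below are made up for illustration. Note that what bind expresses here is a data dependency (the inner step needs the outer row), which may not be quite the same thing as enforcing an execution order.

```python
def bind(rows, f):
    """List-monad bind: apply f to each row and concatenate the results."""
    return [out for row in rows for out in f(row)]

def unit(x):
    """List-monad unit: a one-row result."""
    return [x]

customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
orders = [{"customer_id": 1, "item": "book"},
          {"customer_id": 1, "item": "pen"},
          {"customer_id": 2, "item": "ink"}]

# Roughly the nested loop behind:
#   SELECT c.name, o.item FROM customers c, orders o WHERE o.customer_id = c.id
result = bind(customers, lambda c:
         bind([o for o in orders if o["customer_id"] == c["id"]], lambda o:
         unit((c["name"], o["item"]))))
```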
