Deca, an LtU-friendly bare metal systems programming language

The Deca programming language is "a language designed to provide the advanced features of sophisticated, high-level programming languages while still programming as close as possible to the bare metal. It brings in the functional, object-oriented, and generic programming paradigms without requiring a garbage collector or a threading system, so programmers really only pay in performance for the features they use." The latter link provides a list of features that Deca does, will, and won't provide. Features provided include type inference, universally- and existentially- quantified types, and "a strong region-and-effect system that prohibits unsafe escaping pointers and double-free errors".

The Deca language and ideas behind it are documented in a thesis, The design and implementation of a modern systems programming language (PDF):

Low-level systems programming has remained one of the most consistently difficult tasks in software engineering, since systems programmers must routinely deal with details that programming-language and systems researchers have preferred to abstract away. At least partially, the difficulty arises from not applying the state of the art in programming-languages research to systems programming. I therefore describe the design and implementation of Deca, a systems language based on modern PL principles. Deca makes use of decades in programming-languages research, particularly drawing from the state of the art in functional programming, type systems, extensible data-types and subroutines, modularity, and systems programming-languages research. I describe Deca's feature-set, examine the relevant literature, explain design decisions, and give some of the implementation details for Deca language features. I have been writing a compiler for Deca to translate it into machine code, and I describe the overall architecture of this compiler and some of its details.

The source code for the Deca compiler, decac, is available here. The compiler is implemented in Scala and generates LLVM bytecode. (The author points out in the comments below that this implementation is a work in progress.)

The author of Deca is LtU member Eli Gottlieb, who back in 2008 posted in the forum asking for feedback on his language: Practical Bits of Making a Compiler for a New Language.

There's some more discussion of Deca over at Hacker News.

Of critical interest

Page 6:

However, systems programming sometimes requires relaxing even the most useful safety restrictions. For example, kernel virtual memory subsystems on the x86 architecture[13] must often allocate memory at page boundaries, and then treat the newly-allocated page as an array of 32-bit pointers. This cannot be done via a language runtimeâ€™s heap-allocation algorithms that donâ€™t respect page boundaries, since a true systems language cannot actually impose the use of a specific runtime library on its users. Deca thus provides an â€œescape hatchâ€ for violating type-safety, when necessary, in the form of bit-cast expressions. These accept an expression and reinterpret the physical memory occupied by the result as having an arbitrary given type, asserting that type to the type-inference system without constraining the type of its input at all. They are equivalent to C++â€™s reinterpret_cast[14], though they operate on any data types rather than just pointers by allocating space for the resulting type and storing the source data into that space. Since cast expressions are typed as their result type, they integrate with the rest of the
type system.

So it's not completely type-safe if you use CAST expressions, but it seems like a good balance overall. Pointer layout declarations like tagged pointers, and declarative memory assignments are always the trickiest patterns in systems code, so I always look for that first in any new systems programming language. I'm curious if any language has a completely safe solution to these two patterns.

By naasking at Mon, 2012-01-02 03:45 | login or register to post comments

That's currently an unfortunate "blind spot" in Deca.

I know that LLVM has arbitrary-length integers that could be used for tagged pointers as follows, for example:

type TaggedPointer = [i30,i2]


function deconstructTaggedPointer<T>(p: TaggedPointer) {

  cast<@T>(p)

}

function taggedPointer<T>(p: @T) { cast<TaggedPointer>(p) }

This would require a bit of boilerplate code, but adding a language construct to appeal to programmers assuming the bit-width of pointers.... seemed wrong somehow. With subtyping and physical subtyping now cleanly separated in the type system, it's certainly a good idea in search of an elegant implementation.

Could you clarify what you mean by declarative memory assignments? Googling the term brings up neurology papers.

By Eli Sennesh at Mon, 2012-01-02 04:11 | login or register to post comments

adding a language construct

adding a language construct to appeal to programmers assuming the bit-width of pointers.... seemed wrong somehow.

For most code, it is wrong. But there's a lot of low-level code out there that does need these sorts of representation, like language runtimes, garbage collectors and page mapping logic as found in kernels. For instance, Haskell sometimes uses 2 low-order bits to represent a sum tag.

This is why BitC includes a standard optimization in the spec for representing certain tagged pointers, although even BitC's implementation is rudimentary at best (it can't solve the Haskell sum tags for instance).

Like I said, it's one really hard abstraction to represent safely, which is why I always look for it first in any new systems language.

Could you clarify what you mean by declarative memory assignments?

Sorry, I just literally mean assigning a fixed memory location. In C/C++:

int* p = (int*)0x23aef4b;

Been a few years, but in Ada, it's something like:

p'Address := 0x23aef4b;

Assigning fixed addresses is required in driver code that integrates directly with hardware.

Edit: if pointer alignment were tracked in the type system, this would allow for word-sized tagged unions. This way the compiler could verify that X low-order bits are available for 2^X cases contained in those bits.

By naasking at Mon, 2012-01-02 15:27 | login or register to post comments

Oh, ok.

Well, now that subtyping and physical subtyping are cleanly separated in the language, rewriting the numerical types to just specify their bit-width (eg: int16 for signed 16-bit, nat32 for unsigned 32-bit) would work. I would just have to define width-subtyping correctly over integers and make a special case for the long integer to FP128 subtyping. The width-subtyping there could also enable physical subtyping on integers... nice.

As to assigning fixed addresses, it's doable via escape-hatch in Deca. You would CAST from the integer form of the address to an appropriate pointer type. CAST is Deca's unsafeCoerce: it will let you subvert the type system arbitrarily, make it believe any assertion of safety that you please, but on your head be the consequences of your own mistakes.

I would call that the philosophy behind Deca: "you can shoot yourself in the foot... provided you turn off the safety first."

By Eli Sennesh at Mon, 2012-01-02 19:06 | login or register to post comments

As to assigning fixed

As to assigning fixed addresses, it's doable via escape-hatch in Deca. You would CAST from the integer form of the address to an appropriate pointer type.

Yes, I realize this is unsafe and difficult to support properly, which is why when any new safe systems language comes along, I always look up how it handles this case. For instance, it would be "safe" to allow a coercion from an int to a typed pointer provided you can statically guarantee that no other operations can accidentally alias that location and view it as an incompatible type, ie. via an offset from some other address, or another fixed address that overlaps with an incompatible type.

Regarding subtyping on integers, I'm not sure what that's supposed to address. Do you foresee it addressing word-sized tagged unions, like this C fragment:

struct Tagged {
  union {
    int integer;   //tag = 1
    void *pointer; //tag = 0
  } bits : 31;
  int tag : 1;
};

Preferably, the Tagged.bits.pointer would also specify the type pointed to.

By naasking at Mon, 2012-01-02 21:17 | login or register to post comments

Well this is an honor. Also embarrassing

And here's me still reimplementing most of the compiler in the "newsig" branch on the hg. I had some nice little examples compiling, but I then found that my compiler was buggy crap at emitting references to symbols external to the current module. I also didn't have a proper region or effect system in that iteration.

So this branch has: cleaner implementation of type system, effects, regions, mutability control on slots (with sub-mutability-ing), and a general cleaner implementation of everything (especially the importation of symbols).

The embarrassing bit? All this still needs more coding to get it up to where the compiler used to be, and I haven't updated the doc (aka language report aka undergraduate thesis) lately.

But it is all moving forward, slowly but steadily! LtU has been a MASSIVE help over the years (alongside my professors) in educating me well enough to stick with clean, well-known PL solutions for most things. Oh, and without LtU, I would never have seen Ben Lippmeier's Disciple language and associated papers, which were THE chief inspiration for doing the extra work of having proper regions and effects in Deca.

So yeah.

By Eli Sennesh at Mon, 2012-01-02 04:22 | login or register to post comments

This is it

Eli Gottlieb: Oh, and without LtU, I would never have seen Ben Lippmeier's Disciple language and associated papers, which were THE chief inspiration for doing the extra work of having proper regions and effects in Deca.

Then, as far as I'm concerned, LtU has fulfilled its raison d'étre.

By Paul Snively at Mon, 2012-01-02 04:26 | login or register to post comments

Ãªtre, ou ne pas Ã©tre

By (deliberately?) misspelling Â«â€‰Ãªtreâ€‰Â», you raise an important and relevant point. Should the French orthography be overhauled to rely on the inference of lâ€™accent circonflexe a lÃ Hindley-Milner â€” much the same way that the diaeresis Â¨ is routinely inferred in the English reÃ«lection and coÃ¶peration or the Russian Ñ‘Ñ€Ñˆ (usually spelled ÐµÑ€Ñˆ) â€” or should the French continue to require the explicit presence of this (type) annotation? Telle est la question. It is true that lâ€™accent circonflexe serves an occasional useful purpose of being the sole distinction between two different words, e.g.:

il eut mangÃ© de la brioche;
quâ€™il eÃ»t mangÃ© de la brioche.

But is it really needed? In the dark ages spanning the period between the Visigoths and Guillaume le ConquÃ©rant, the use of lâ€™accent circonflexe was aimed at making French more palatable to the speakers of English. To wit, Ã®le â†’ isle, forÃªt â†’ forest, hÃ´tel â†’ hostel, hÃ¢te â†’ haste, etc. But nowadays, it's about as redundant as explicit type declarations in Java.

Re: comment 68743: â€œLtU has fulfilled its raison d'Ã©tre [sic]â€.

LtUâ€™s raison dâ€™Ãªtre is its repeatedly demonstrated ability to turn any discussion into a PR campaign for the greater use of (static) typing, nâ€™Ãªt-il pas vraiâ€‰?

By el-vadimo at Thu, 2012-01-05 16:53 | login or register to post comments

Very Kind...

...of you to assume my error was deliberate. :-)

As for the rest, I personally am delighted when there's research that seems to really delve into what not having a type system can offer. My current poster-child for this is John Shutt's Kernel work. My slogan now is "take the phase distinction seriously or don't have one." What continues to be logically incoherent is the claim that dynamic typing has any positive benefit over static typing that isn't isomorphic to researching (non)computability itself. I also can't help it with respect to LtU if the preponderance of PLT research reflects the fact.

By Paul Snively at Sun, 2012-01-08 18:29 | login or register to post comments

LLVM

I think it's a bit funny that almost no one does "systems programming" anymore other than people who are doing embedded systems and those are plenty; probably number three after web and enterprise programming. So most people who are programming in C and assembly, who actually do safety critical and bare-to-the-metal stuff have zero benefit of any modern "systems programming language" [1]

So maybe meeting an EE professor at lunch or another one who teaches embedded systems and who is interested in safe and efficient NFC transactions or something alike may turn out to be inspiring. It's probably not really important and we don't urgently need any new programming language anyway but it would be interesting from an artistic point of view.

Rob Pike rants about Go being misunderstood and people refuse to make experiences in favor for judging super-quick but he has also a hard time acting in a niche that is already overcrowded.

[1] Me and my team mates used C++ a lot 4-8 years ago but it was all application programming for which we could have easily used Java or C# instead.

By Kay Schluehr at Mon, 2012-01-02 06:26 | login or register to post comments

Systems collapse or evolution around a roadblock?

Is there no reason for a systems-programming language beyond C anymore, itself doomed to maintenance and legacy tasks while everyone lives on in virtual realities (sorry, I mean machines), or did virtual machines and very-high-level applications languages become dominant because nobody could agree on what the hell to do with systems programming?

I'm passing the buck: let's ask the people who did or are doing BitC, Cylone, Clay, Tart, and Rust. Then let's talk to the folks at UCSD who were doing a "Liquid Types" project as an extension to C for systems programming, and the folks at Portland State University doing Habit, Haskell-for-kernels. And I've heard rumors of a certain other institution (which I'm not naming because I haven't seen a public announcement of the project, but that may be my own ignorance) getting a big grant for highly-secured systems programming languages. And Galois certainly does well for themselves.

Assuming my answer to this question of "modern systems programming" is completely and utterly insufficient, how can we dismiss the question itself when we hear through scientific peer-reviewers and funders that they would all like to see someone take a stab at that question?

Many years ago I read a presentation entitled, "Systems Software Research is Irrelevant". Tell that to a researcher I took a course under, named Emery Berger, who has made his career on systems research. Hell, tell it to the "How to have a bad research career" presentation I also read, one of whose bullet points was "DO NOT do a systems project!".

It seems like there's unfilled demand for better systems software and better languages and runtimes for producing it. Unfortunately, for some reason the Problem of Systems Software continues to be consider "solved". Of course, Plan 9 was very good ;-), but it came out right in time to kill itself as the shift from timesharing to personal computers happened.

By Eli Sennesh at Mon, 2012-01-02 08:32 | login or register to post comments

...or did virtual machines

...or did virtual machines and very-high-level applications languages become dominant because nobody could agree on what the hell to do with systems programming?

Something like that. I would lay much of the blame on automatic garbage collection, which while extremely useful, created a hard dividing line that sucked language innovation away from systems languages.

One of the challenges here is that the target audience has a severe case of Stockholm Syndrome - they're so used to the pathological ecosystem they've learned to live with that they can't imagine why anything else would be needed.

By Anton van Straaten at Mon, 2012-01-02 09:14 | login or register to post comments

A great fraction of systems

A great fraction of systems programmers uses languages from the 1970s and no one cares.

No one also cares about capability security and native sandboxing on Smartcards. Instead we have JVM technology there for code loaded from 3rd parties with its considerable overhead. There is zero research about those topics, except from some companies which do it by themselves. There is an increasing number of university projects that deals with hacking cars, trains and planes. Scary, no? Unlike PCs where software always put pressure on customers buying new generations of hardware there isn't any such pressure in the embedded sector. Since it employs armies of testers anyway it may be of little surprise that it is relatively open minded towards formal methods.

No, systems software research is far from done, but the 10th billion language that targets x86 is probably not so urgently needed as many programmers might think.

By Kay Schluehr at Mon, 2012-01-02 10:16 | login or register to post comments

About caring

A great fraction of systems programmers uses languages from the 1970s and no one cares.

You could have said this in the early 1990s about the possibility of languages like Java and Python. Turns out that although it was possible to have the perception "no one cares" at that time, once better alternatives were available, a majority cared.

No, systems software research is far from done, but the 10th billion language that targets x86 is probably not so urgently needed as many programmers might think.

Again, same issue.

In any case, this all seems rather off-topic in this thread. Since LtU is a site devoted to discussion of programming languages, a PL version of the quote by New Scientist editor Alun Anderson applies. New programming languages with interesting feature sets are inherently interesting, and if you don't agree...

By Anton van Straaten at Mon, 2012-01-02 20:25 | login or register to post comments

New programming languages

New programming languages with interesting feature sets are inherently interesting, and if you don't agree...

I studied mathematics as a "brotlose Kunst" for a couple of years. So this has my sympathy. And yes, maybe someone still finds something new even in overcrowded fields and searching under the lamp.

By Kay Schluehr at Tue, 2012-01-03 06:35 | login or register to post comments

Pick a hard problem, drill at it until it throws off new ideas

This language has actually ended up spawning a few possibly-new ideas: multi-method type-classes (which need a lot of work), a design pattern that uses implicit parameters to capture modules from the environment (which seems kinda trivial), and the use of subtyping-esque mechanisms to control the mutability of slots across pointers (this one's the most well-fleshed-out and researchy).

When everything is solved, go find harder problems. Hence systems programming languages instead of regular PL (not everything is solved in PL, but its increasing subspecialization is a sign).

By Eli Sennesh at Thu, 2012-01-05 04:50 | login or register to post comments

â€œYou and Your Researchâ€

Hamming's book is a great treatment of this topic.

By el-vadimo at Thu, 2012-01-05 13:40 | login or register to post comments

David Jensen gave my "Research Methods" class a lecture on that

Really great stuff. I do have to admit, the issue of whether this work will be scientifically relevant has occasionally kept me up o' nights. Well, it would have if I wasn't exhausted from sleep deprivation.

On the other hand, it did throw off that last idea that I'm typing up into a little draft of a pretend workshop paper right now. I've thought a bit about it, and I've realized that one's actually pretty relevant to both PL and systems folks. Mutability and synchronization are big issues everywhere.

By Eli Sennesh at Thu, 2012-01-05 23:27 | login or register to post comments

I do have to admit, the

I do have to admit, the issue of whether this work will be scientifically relevant has occasionally kept me up o' nights.

When Hamming started to work in the early 1940s you could still know everyone by name who worked as a theoretical physicist, not to talk about the borders of 'communication', a term which wasn't even used in science before Shannon introduced it, and math. Scientists were just discovered to be an important strategic resource for the wealth and competition of nations who could actually be produced, as "organization men" [1] instead of being individual and accidental geniuses.

When I studied, the whole perception was already organized. The "important problems" were known and famous and many people flocked on them, often in global cooperation and individual competition. This also happened in the "pure" sciences which were not capital intensive "Big Science" in the way Los Alamos, the lunar landing project of the NASA or the LHC are. Superstring theory is a lively example and it was a great fashion in the 1990s - doing speculative mathematical-physical research was not much debated as it became a decade later. In some sense the research network already answers the question about what is important and what is not and this can be measured in terms of citations, public attention, financial grants and human resources which can be acquired using grant money. Since the network answers the nightly questions for you, you don't have to.

I'm not sure if young researchers don't get all those capitalist science management and marketing aspects much better than I did. I can't remember that me and my friends reflected the science and research market at all, speculating about what was "promising", which communities are probably interested and so on. It was still all romantic heroism: great people who are doing extraordinary things. That's actually a very beautiful, mythic tradition, which is kept alive through story telling a la Hamming. We felt not quite like stooges, as Hamming did early in his career, but also other-directed without knowing what to oppose to and the only way we found to rebel against it was to drop off.

[1] I refer to William Whyte's famous sociological study from 1956.

By Kay Schluehr at Fri, 2012-01-06 08:35 | login or register to post comments

It's probably not really

It's probably not really important and we don't urgently need any new programming language anyway but it would be interesting from an artistic point of view.

I disagree, it is still important. In what language are you going to write your safe language runtime and garbage collector, if not a systems language that can perform safe bit-twiddling on pointers?

By naasking at Mon, 2012-01-02 15:31 | login or register to post comments

In one that allows the use

In one that allows the use of something like an "unsafe/systems" package.

Some examples, Modula-2, Modula-2+, Modula-3, Oberon, Ada, D.

By pjmlp at Mon, 2012-01-02 18:24 | login or register to post comments

The point is to eliminate

The point is to eliminate the unsafety entirely. Isolating unsafety is a great first step, but not quite enough. Other than typed assembly language, I don't think there's a "systems language" in existence that safely deals with tagged pointers. Actually, I'm not even sure whether typed assembly language supports this usage.

By naasking at Mon, 2012-01-02 18:48 | login or register to post comments

Burroughs B-5000

As it seems this has been done already in the 60's with ALGOL, unfortunately it did not catch on.

By pjmlp at Mon, 2012-01-02 20:37 | login or register to post comments

I am all for safer systems

I am all for safer systems programming languages.

Many of the security exploits we suffer, are a consequence of C and to lesser extent C++ use.

I am a firm believer that if a more type safe language would be used, there would not exist so many security issues.

The problem is that it takes generations to bring the programmers to new paradigms.

By pjmlp at Mon, 2012-01-02 18:28 | login or register to post comments

The problem is that it takes

The problem is that it takes generations to bring the programmers to new paradigms.

That's not so much a "problem" as it's merely a constraint.

By james woodyatt at Tue, 2012-01-17 03:26 | login or register to post comments

i wish it were also a business opportunity

i mean, shouldn't it be? in this day and age, shouldn't even grandmas be interested in things that don't leak their credit card #s? why hasn't spark ada taken over everything by now? :-)

By raould at Tue, 2012-01-17 22:33 | login or register to post comments

Sure it should

But we all know that security doesn't sell well, so it's no surprise that managers don't take into account security when selecting a language.

By renox at Wed, 2012-01-18 09:10 | login or register to post comments

Spark Ada cares for not

Spark Ada cares for not storing credit card information as plain text in a data base? Sounds like a pretty cool language feature.

By Kay Schluehr at Wed, 2012-01-18 14:25 | login or register to post comments

Related works: ATS

I'm surprised not to see ATS mentioned among Clay, BitC and Cyclone in the summary.

Looks very interesting anyway, I'll have a look as soon as possible.

By gasche at Mon, 2012-01-02 06:51 | login or register to post comments

ATS is dependently typed and used for proofs

I would call it the next tier up of PLT rigor from Deca, Clay, BitC, Cyclone. None of this set offer theorem-proving capabilities or dependent typing; ATS thus has the honor of being the only Agda competitor for systems programming.

By Eli Sennesh at Mon, 2012-01-02 08:24 | login or register to post comments

I was thinking of the

I was thinking of the low-level aspects of ATS (linear types, destructive pattern matching, non-mandatory garbage collection) rather than the dependent type / proof aspect, which is indeed less relevant. I don't think the focus of ATS is on theorem proving, and thus I wouldn't call it an "Agda competitor" (even if Agda itself is on the edge between a programming and a proof system; at least I certainly wouldn't call ATS a Coq competitor).

I think the possibilities offered by ATS in this low-level area are quite nice, and only partially related to the expressiveness of the dependent type layer -- I'm not sure which low-level parts really rely on it, though I can certainly see it being helpful to avoid eg. some dynamic bound checking.

By gasche at Mon, 2012-01-02 12:48 | login or register to post comments

Idris

ATS thus has the honor of being the only Agda competitor for systems programming.

No it doesn't! Edwin Brady's Idris is quite adequate for the job. There's a new version out, I warmly recommend checking it out. The implementation is usable and Idris syntax will probably appeal more than the ATS one to people with a Haskell background.

By jack at Tue, 2012-01-03 12:41 | login or register to post comments

As I claimed above, ATS is

As I claimed above, ATS is *not* an Agda competitor. I agree that Idris can be seen as an Agda competitor, but on the other hand I wouldn't claim it is about "systems programming" in the sense used by Eli here, where you try to write programs without any dynamic memory allocation. I would rather describe it as a (less proof-theoretically sophisticated) Agda with a viable C FFI rather than a viable Haskell FFI; at least the previous version that I had looked at.

ATS is much lower-level than that: it allows to avoid GC, to manipulate unboxed binary data, and even to perform pointer arithmetic. You could consider rewriting a C program entirely in ATS, you can write a kernel driver in ATS today (or rather, in 2007). I'm not sure that's why Idris is aiming at.

PS: there is a lot of ambiguity around what "systems programming" mean. In this thread I am interested in "systems programming" in the sense of "the kind of C code you write for embedded systems or OS kernel development", but there are more high-level views of "resource-conscious software layers that are below the end-user programs the user interact with", a bit like the Go tribe advertise. I suppose both are equally reasonable use of the term, but it's still interesting to make a distinction between languages following the low-level or the high-level vision.

By gasche at Tue, 2012-01-03 14:06 | login or register to post comments

You could consider rewriting a C program entirely in ATS

You could consider rewriting a C program entirely in ATS

I've been doing this a bit lately, converting and writing new versions of some C based programs I've done into ATS. From usage, I feel ATS is more towards the Cyclone/BitC level of programming than the Agda level of programming.

By doublec at Wed, 2012-01-04 01:07 | login or register to post comments

[Meta] /deca URL

Anton, how did you give this story a custom URL?

(I should also add that while I appreciate the ease of linking provided by such URLs, I'm unsure whether it's worth to give up LtU's sequential node numbers - they make it easy to discern the approximate age of a post.)

By Manuel J. Simoni at Mon, 2012-01-02 17:21 | login or register to post comments

In the story submission

In the story submission form, there's a "Path alias" field that determines this. It's possible the field only appears for admins, though, not sure about that.

I hadn't considered using the LtU node id as an alternative calendar for the history of programming languages... That's such a post-node-4424 idea!

By Anton van Straaten at Mon, 2012-01-02 20:17 | login or register to post comments

Too soon

There are still people suffering through node-4424.

By Matt M at Mon, 2012-01-02 21:25 | login or register to post comments

Why not Ada?

While I haven't used Ada much, I thought that you could use it as a "system programming" language..

This is not an advertisement for Ada, but more a generic question: why do people still compare new system programming langages to C instead of Ada?
Sure C is much, much more popular/used, but in terms of feature I think that the one to beat/compare to is Ada, not C..

By renox at Tue, 2012-01-03 00:12 | login or register to post comments

why

why do people still compare new system programming langages to C instead of Ada?

For the reason you mentioned, C is much much more popular. The speed of hardware is defined in terms of how well it executes C code.

It is precisely the feature rich aspects that make me not even sure I'd consider Ada to be a systems language. C conversely has features that are reflective of what hardware can do, not what programers want to do.

By JeffB at Tue, 2012-01-03 18:50 | login or register to post comments

Excellent quotes from the paper

"Right now, Decaâ€™s object system provides its only means to support overloading and ad-hoc polymorphism. This is because the literature on overloading via type-classes remains so shrouded in Haskell culture and terminology that I have had trouble understanding it well enough to add to a language other than Haskell."

"Allocation isnâ€™t a real side-effect, it just looks like one!"

"Papers with sound findings do get rejected, after all, and some
unknown logic mistake could always remain."

The best (from the dedication): "I am become type theory, destroyer of minds."

By johnwcowan at Tue, 2012-01-03 07:19 | login or register to post comments

And I was hoping to someday live that down!

Please forgive amateur work by an undergraduate.

By Eli Sennesh at Thu, 2012-01-05 03:13 | login or register to post comments

Actually, there's a story ahead of those...

When I got out of last Spring semester, I went home and basically slept for a week. Then, over the summer, I managed to catch up on most of the papers I had missed or not been able to understand.

Effect systems? Check.
Type-classes? Check, though I eventually decided to use implicit parameters instead (but with Deca's scoping rules instead of Scala's, and a notion of final implicits that can't be overridden). Of course, the funny thing about type-classes and multimethods is how alike they are, and yet how utterly different. I want to eliminate a feature in favor of an equivalent or more powerful feature, but I can't.

"Allocation isn't a real side-effect" is a paraphrasing of the popular view in "mainstream" PL, such as Haskell, where allocation is performed ALL THE TIME without the IO or St monads becoming involved, for some strange reason. In truth, allocation looks like a side effect to the allocator itself, but looks pure to the client of allocation.

My paper *did* get rejected, and in fact about 10% of the criticism in the reviews *actually* clued me in to errors. On the upside, the arXiv copy got noticed and I was asked if I might resubmit to some conference when I fixed things. That's done, and by now I'm experiencing the "knowledge rises, expectations fall" effect described in PhD Comics. This iteration of the paper will probably be rejected (almost all papers are), but if all the reject-reviews are for aesthetic reasons, I'll at least take the author appeal.

By Eli Sennesh at Thu, 2012-01-05 04:04 | login or register to post comments

I want to eliminate a

I want to eliminate a feature in favor of an equivalent or more powerful feature, but I can't.

You mean something like predicate dispatch?

"Allocation isn't a real side-effect" is a paraphrasing of the popular view in "mainstream" PL, such as Haskell, where allocation is performed ALL THE TIME without the IO or St monads becoming involved, for some strange reason.

Depends who you ask. Region-based memory management clearly considers allocation a side-effect. Most languages don't because then just about every language is effectful.

By naasking at Thu, 2012-01-05 16:29 | login or register to post comments

Not full predicate dispatch, still short of that

You mean something like predicate dispatch?

Multimethods, particularly hierarchical unambiguous multimethods as invented by Millstein et al and used by myself, end up acting pretty much (not quite the full moral equivalent, but close enough) like a pattern-match done at runtime on the variant tags or class-tags of the actual data.

Type-classes and implicit parameters match on static types in much the same way.

So the question is: could we have hierarchical, dynamically-dispatched type classes? Can we assemble a tree of possible dictionaries at compile time (just as we do for the hierarchical multi-methods). At runtime, we use a hierarchical tag-match to select the most appropriate of those known cases and dispatch through the appropriate dictionary.

Why would we even want that? Because that would give a multimethod system one of the most desirable features of both Haskell and Java: the guarantee that a class implementation must implement all the relevant methods.

There's also an obstacle, at least as Deca sees things. Type-classes and implicit parameters are usually considered moral equivalents. They are not, however, semantic equivalents, and this extension shows up the difference: you can't use matching a tag tree to fill in an implicit parameter of a dictionary because the function parameters are contravariant. The dictionary selected based on the tag matching wouldn't subtype or type-equal the implicit parameter's type.

And if we go for full Haskell-style type-classes, does the definition of an instance method see its own dictionary, with type-class methods having the correctly specific signature? What sort of semantics does that have? I can't see anything outstandingly wrong about that, I think that's how Haskell mostly does it, but I'm not sure the semantics remain sound in the presence of subtyping and hierarchical dictionary selection.

Of course, I would normally want to keep the implicit parameters, because as noted sometimes on LtU, they have their own "perks" aside from being usable for type-classes.

So yeah, multimethod type-classes sound nice but need a lot of work, semantically.

Depends who you ask. Region-based memory management clearly considers allocation a side-effect. Most languages don't because then just about every language is effectful.

Well at the systems level, just about every language should be considered effectful. House's (as in House the Haskell kernel) monadic treatment is one way of handling it, Deca's importation of effect-system work is another, but you do need some way of handling it.

By Eli Sennesh at Thu, 2012-01-05 23:17 | login or register to post comments

Because that would give a

Because that would give a multimethod system one of the most desirable features of both Haskell and Java: the guarantee that a class implementation must implement all the relevant methods.

I wasn't aware that Deca's multimethods weren't statically checked. Some references that may help:

Type-checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance -- directly relevant to multimethods
Extensible Programming with First-Class Cases -- perhaps indirectly relevant, but in principle it does the same thing you're after, ie. gathers disparate type and case definitions into single case statements and checks for exhaustiveness.

Well at the systems level, just about every language should be considered effectful.

Oops, my sentence should have read "just about every function is effectful (because it allocates)". Don't know what I was thinking...

By naasking at Fri, 2012-01-06 02:55 | login or register to post comments

Irony!

Deca's multimethods are statically checked, according to exactly the methods given in that first paper, which I cite. decac can statically verify that every call to a multimethod has a non-empty set of valid possible cases onto which to dispatch. It helps to use Deca's terminology here: "methods" are the generic functions, "overrides" are the defmethod'd cases.

What I meant was, consider a type-class in the Haskell sense, say Eq a. We all know it has two methods, two functions that must be implemented to provide a valid dictionary for any a. Java's Comparable interface can express the same thing.

So how do we express that in a multimethod system? As I've seen multimethods traditionally presented, we can't... YET. There's no major theoretical reason why we can't, we just haven't figured out a clean, non-bloated syntax and semantics for expressing that some class/variant/ADT/$WHATEVER must not only implement one multimethod but several.

If we didn't need to have default cases for statically safe multimethods, which IIRC we do, we could just declare our methods to be dispatched on the appropriate supertype and then every subclass, or even combination of subclasses as it may be, would be required to provide an implementing method for every such multimethod. If this sounds like it completely ruins modularity by letting any foreigner and his module impose an implementation requirement on my type, I think that's because it does -- hence default cases and such.

So: how do we specify, in a modular and type-safe way, that the method-case trees (see Millstein) of certain multimethods must overlap, or, put another way, that certain types must implement method-cases not only of one multimethod but of several?

By Eli Sennesh at Fri, 2012-01-06 17:52 | login or register to post comments

So: how do we specify, in a

So: how do we specify, in a modular and type-safe way, that the method-case trees (see Millstein) of certain multimethods must overlap, or, put another way, that certain types must implement method-cases not only of one multimethod but of several?

This is a good question I sumbled across again this morning, but still didn't have an answer. Fortuitously, I just came across a paper which does address this in one way: A Second Look at Overloading.

Think of it like type classes, but without the class declarations. You instead specify the overloaded operations by name. This obviously implies more verbose function types, since you have overloaded name constraints instead of class constraints, but this does address the question above: to ensure a type implements several multimethods, just explicitly declare it on at least one function signature.

That the used mulimethods don't appear in function signatures is where the problem really lies.

By naasking at Wed, 2014-04-23 16:10 | login or register to post comments

Very interesting

I really like his ideas. Targeting a specific compiler seems like a good way to avoid the complexity of hardware and still get tremendous performance. The idea of features on by default, yet avoidable seems like the right middle point. And making sure that code compiles efficiently seems right. I can't wait to see benchmarks and sample comparison code, to see if this approach is actually working.

Good job Eli!

By JeffB at Tue, 2012-01-03 18:59 | login or register to post comments

Nice work

Nice work Eli. I had a couple of questions.

Because you are targeting LLVM, you are accepting a bit of platform-dependence in basic operations such as the overflow behavior of integer arithmetic, shifts, etc. How does this play out in the semantics of Deca? You explicitly do not have pointer arithmetic in Deca; how would you write, e.g. a GC (which does a lot of pointer arithmetic)?

You mention that you don't need much of a runtime system, yet you do have lambdas and closures. How does the runtime system hook up to these (allocation sites)?

By Ben L. Titzer at Wed, 2012-01-04 03:15 | login or register to post comments

TINM -- There Is No Malloc

I would write a GC in Deca by casting back and forth between pointers and integers to do the arithmetic.

I had been planning for the semantics of Deca to follow C semantics, but with a Pascal-like compiler switch for adding exceptions at things like integer overflows. Unfortunately, the platform-dependence here is pretty thorny: I can guarantee platform-independent semantics either by tying Deca to an "imaginary" platform like LLVM, or by carefully reimplementing the desired semantics by hand in custom-built code generator backends.

There is no runtime system that hooks up to lambdas and closures. Closures are existential packages. Existential packages are allocated as blobs of bytes, the least-bad of several different pretty bad options for how to handle them without garbage collection.

The one upside to doing existentials that way is that any pointers captured by the existential have their regions "leaked" into the existential signature, making it statically impossible to pass data up the stack beyond where it remains validly accessible.

Cold comfort to everyone used to dynamically-allocated, garbage-collected closures, though.

By Eli Sennesh at Thu, 2012-01-05 03:26 | login or register to post comments

I can guarantee

I can guarantee platform-independent semantics either by tying Deca to an "imaginary" platform like LLVM

You have to be careful there though, because LLVM bitcode is not entirely platform independent . Depends entirely on how you handle platform word size.

By naasking at Thu, 2012-01-05 16:33 | login or register to post comments

High-Level and system oriented : Lisaac

There was a language which tried to be high-level and system oriented : Lisaac.

Lisaac is prototype based object language made to code an operating system (IsaacOS).
For system programming, one can declare a 'Section Mapping' in which you can declare variable which are 'mapped' to an hardware structure, respecting to the order.

You can find some code here : http://gitorious.org/lisaac

By Pierre-Alexandre Voye at Mon, 2012-01-23 10:21 | login or register to post comments

User login

Navigation

Deca, an LtU-friendly bare metal systems programming language

Comment viewing options

Browse archives

Active forum topics

New forum topics

Recent comments