LtU Forum

online literature on CPL?

Hello all!

Is there available online (and free) literature on CPL? More specifically, I am interested in
`D.W. Barron, J.N. Buxton, D.F. Hartley, E. Nixon, C. Strachey. The main features of CPL. The Computer Journal, Vol.6, #2, pp.134-143 (1963)'
and
`D.W. Baron, C. Strachey. Programming. In L. Fox, ed. Advances in Programming and Non-numerical Computation, pp.49-82. Pergammon Press, 1966'.
So far, the only source I was able to find is this volume in commemoration of Chrostopher Strachey, but even there they seem to have pulled off the full texts already.

Thank you in advance,
Boyko

Lexical structure of scripting languages

hi all,
I'm a newbie here but I assure you I'm not a troll. I was reading about Pluvo and then it struck me...

so you are bored and want to write a new language. what do you do at first? plan a look and feel. well, different people come from different backgrounds and they don't like the same look and feel. but then, it basically means you write a lexer, a parser, and then finally the interpreter. too much work. and it's not reusable by any other bored programmer. it's fun doing it again, of course. :)

I was thinking, what about having a rough lexical standard for interpreted languages? the point is that with clear specification everyone can write a lexer/parser and then extend it. hopefully, the base will already be there.

of course, different languages are different...
but I think some guidelines are absolutely necessary. not for the expert programmers, they can cope with anything, but the objective is to reduce the learning time for students.
so here are some imaginary rules. each idea is independent of the others. if you like them, that would probably convince you that it is indeed useful to do something like this.

  • source codes are XML files. there are four different types of text: documentation, annotations, code and comments. an IDE is a must so that full advantage of this formulation can be taken. the basic principle is that source code itself is valuable data. something like:
    <source language = "imaginary" target = "2.1">
    <documentation>
      The classical "Hello World" program
    </documentation>
    <code>
      <annotation visibility = "public" />
      <comment>the exact syntax is not relevant here</comment>
      say "Hello World"
    </code>
    </source>
    </documenation>
    

    too verbose for text editors, but not so for an IDE. parsing XML is easy. displaying different things in different colors is.. well.. depends on you. but text editors like gedit, emacs, jEdit, Notepad++, know what to do.

  • the encoding is always Unicode. this basically means you can use mathematical symbols as part of your language. advantage? well, it's been like 40 years or so of ASCII based programming, right?
  • there should be an all-pervasive naming convention. either camelCase or if the language permits, hyphen-seperated-words. but not both.
  • long names are preferred. so no more MVar or mapM_ or __init__ and scoping rules apply. Monad.map looks pretty (over mapM). scaring off uni students is not going to do any good.
  • the languages should have a clear-cut way of communicating with the outside world. the names exported should not contain any eccentricities (should be pure ([alpha][alpha|num]*)).
  • for variable names (or function names), same name with different annotations should not have different meaning (I don't hate Perl, but I think @list and $list having completely different meanings is not healthy).
  • there will be no way to write two different statements on the same line (like ; separated ones in C). block structure dictated by indentation. this suggestion brings up the controvertial idea of 'good taste' again. I think once someone implements a very good library for dealing with one particular standard like these, that library, hopefully very well documented, people will accept the standard just the way it is. may be they will demand minor tweakings here and there, but not more than that. psychology is a really really mysterious thing.

things like these... the problem with interpreted language is that they are very concise, in my humble opinion, too concise. of course, diving any deeper will cause so much difference in opinions that I should probably stop here.

High-Level Nondeterministic Abstractions in C++

High-Level Nondeterministic Abstractions in C++, L. Michel, A. See, and P. Van Hentenryck.

This paper presents high-level abstractions for nondeterministic search in C++ which provide the counterpart to advanced features found in recent constraint languages. The abstractions have several benefits: they explicitly highlight the nondeterministic nature of the code, provide a natural iterative style, simplify debugging, and are efficiently implementable using macros and continuations. Their efficiency is demonstrated by comparing their performance with the C++ library Gecode, both for programming search procedures and search engines.

Although implementing continuations in C(++) with setjmp/longjmp is not new, this paper shows a nice application of the technique.

2006 ICFP Programming Contest registration opens

Language lovers:

Registration is now open for the 9th Annual ICFP Programming Contest!

http://icfpcontest.org

The contest, associated with the International Conference on Functional Programming, will be held on the weekend of July 21-24. The contest task will be released at noon EDT on Friday, and entries will be accepted until noon EDT on Monday. Registration is free and open to all. Teams may participate from any location, and may use any programming language(s). Last year, 360 participants formed 161 teams from 26 countries.

Prize money totaling $1750 US will be awarded to help defray the costs of travel to the conference for the winners and for small cash prizes. In addition, the winners of the contest will receive bragging rights for the programming language of their choice. This makes the contest a popular avenue for demonstrating the superiority of a favorite language, or for exercising an experimental tool.

Though the specifics are secret until the contest begins, we promise that this year's task will be very different from past competitions. This year's theme is "computational archaeolinguistics."

Stay tuned for more information as the contest approaches!

- 2006 Contest Organizers
CMU Principles of Programming Group
icfpcontest-organizers@lists.andrew.cmu.edu

A new implementation of recursive-descent parsing?

It is well known that the classic way to model a recursive-descent parser is to write (or automatically produce) a set of mutually-recursive functions either by writing procedures or combining parser modules using a DSL (like in Boost::Spirit) or using templates/generics (for those languages that support such a concept).

All the different types of implementations work in a similar manner: each procedure invokes one or more procedures. This approach has some drawbacks:

  1. the stack can easily overflow, especially when parsing deeply-nested rules.
  2. some of the results may be thrown away, making parsing slow.
  3. left-recursive grammars can not be handled.
  4. backtracking has a cost.

While trying to approach the issue of RD parsers from a different perspective, I think I have found an implementation which solves some of the problems mentioned above. I hope I am not repeating someone else's work (a brief search over the internet found no similar approaches).

The idea is that instead of the input data travelling down the tree of parsers, each parser processes the current input and then leaves a continuation on a stack. The main parse engine pops the continuation off that stack and executes it; the new continuation puts another state on the stack etc. Thus an RD parser becomes a sort of state machine.

Generally an RD parser consists of the following operations:

  • sequence; parsers are sequentially invoked to parse the input
  • choice; a parser is invoked when another parser fails.
  • loop; a parser that is previously invoked is invoked again.

The traditional approach handles these operations in the following ways:

  • sequence: formed 'naturally' by placing one statement after the other or placing parser objects in a list.
  • choice: 'if' statements are used to select another possible branch.
  • loop: a parsing procedure is invoked again with new parameters.

For example (in pseudo-Java):

class sequence_parser implements parser {
    list parsers;
    iterator parse(iterator input) {
        foreach(p, parser) {
            input = p.parse(input);
        }
        return input;
    }
}

class choice_parser implements parser {
    list parsers;
    iterator parse(iterator input) {
        foreach(p, parser) {
            iterator output = p.parse(input);
            if (output != null) return output;
        }
        return null;
    }
}

class loop_parser implements parser {
    parser other;
    iterator parse(iterator input) {
        input = other.parse(input);
        return parse(input);
    }
}

The new approach handles these tasks differently. The approach uses the following context for parsing:

  1. a stack of parsers that is used for sequences
  2. a stack of parsers that is used for choices
  3. an iterator over the input
  4. a stack of saved contexts used for backtracking

Parsing operations are handled like this:

  • sequence: a new parser is pushed onto the sequence stack.
  • choice: the altenative branch parser is pushed on the choice stack, then the first branch of the choice is invoked.
  • loop: the parser pushes itself on the sequence stack, then invokes another parser.

For example:

class sequence_parser implements parser {
    parser sequence_next;
    parser this_parser;
    iterator parse(context c) {
        c.sequence_stack.push(sequence_next);
        return this_parser.parse(c);
    }
}

class choice_parser implements parser {
    parser choice_next;
    parser this_parser;
    iterator parse(context c) {
        c.save();
        c.choice_stack.push(choice_next);
        return this_parser.parse(c);
    }
}

class loop_parser implements parser {
    parser this_parser;
    iterator parse(iterator input) {
        c.sequence_stack.push(this);
        return this_parser.parse(c);
    }
}

The parsing algorithm then becomes a very simple loop:

void parse(context c) {
    while (true) {
        parser = c.sequence_stack.pop();
        if (parser != null) {
            iterator output = parser.parse(c);
            if (output == null) {
                parser = c.choice_stack.pop();
                if (parser) {
                    c.restore();
                    c.sequence_stack.push(parser);
                }
                else throw parse_exception();
            }
        }
        else (if c.input_exhausted()) {
            return;
        else {
            throw parse_exception();
        }
    }
}

The above does the following:

  1. it pops a parser from the sequence parser stack.
  2. if the sequence parser stack is empty, then if input is exhausted then parsing succeeded, lese parsing failed.
  3. if the sequence parser stack is not empty, then the parser object is invoked to parse the input; it is passed the context.
  4. if the parser succeeded, then the loop continues
  5. otherwise a parser is popped from the choice stack and
  6. a context state is popped from the stack and activated.
  7. The parsing loop continues if the choice parser is not empty.

This approach solves some of the problems of the traditional approach in the following ways:

  1. stack overflow becomes an impossible task because user-defined stacks that can automatically grow is used; the practical limit is the process' memory limit. The user-defined stacks do not grow very large because the main parsing loop continuously pops items from them and new entries are pushed on the stack on a need basis only.
  2. backtracking becomes a constant-time operation no matter how deep the parser has proceeded, because a parser state is popped from the stack. A context save is composed of the sequence & choice parser stack position and the input position (it can be down to 12 bytes if three 32-bit integers are used).

Amusing question

Take a look at this crooked timber thread. How well does this phenomenon apply to our field? Why did you decide PLT is cool? Was it because of the writing from a specific school of thought (let's include papers as well as books)? Did they give you the right impression of the field?

Introduction to Concurrent Programming with Stackless Python

Grant Olson has a writeup of concurrent programming with Stackless Python. There is a great deal of actual code in the article.

Obviously concurrent programming can't be mentioned without mentioning Mozart-Oz (which recently went into version 1.3.2), Erlang and AliceML. Frankly, despite my attempts, I don't understand Haskell's FRP enough to know if it fits here as well.

(while searching archives of LtU for previous interesting mentions of concurrent programming, I came accross a post about SuperGlue...note the first message containing PVR's mention of FRP)

Oxymoronic? "Safety-critical development guidelines for real-time Java"

Came across the PERC Raven Java PDF advertising stuff while looking for other real-time GC implementations. I'm not sure I want Java near my nuclear reactors, but food for thought never-the-less.

(Scroll down to the second page of the PDF to see the guidelines.)

Article: Exploring Cocoa with F-Script

I've made available a new article that shows how it is possible to interactively explore and script Objective-C objects on Mac OS X, using F-Script, an open source scripting language. The article describes the object browser and other interactive modules tailored for the exploration of objects at runtime.

The article is at http://www.fscript.org/documentation/ExploringCocoaWithFScript/index.htm

Nanopass Compiler Framework

While not directly related to PL theory, I consider compiler construction a close relative. This set of papers describes Indiana University's compiler coursework. The key to these papers is "very small source-to-source transformations", remaining executable between passes for pedagogy and verification. Scheme is used as the source and target language because its s-expression syntax requires no 'marshaling' to and from human-readable syntax and AST structures. Later stages of the compiler include annotations directly in the source as quoted, unevaluated lists.

There is also a Summer Scheme Workshop (with complete compiler code) demonstrating some of these ideas. If anyone finds code for the Nanopass or Micropass compiler, please let me know. I'm interested in studying them as well.

A compiler structured as a small number of monolithic passes is difficult to understand and difficult to maintain. ... An attractive alternative is to structure a compiler as a collection of many fine-grained passes, each of which performs a single task. This structure aligns the implementation of a compiler with its logical organization, simplifying development, testing, and debugging.

[1] Compiler Construction Using Scheme Hilsdale, Ashley, Dybvig, and Friedman.
[2] Compiling: A high-level introduction using Scheme Haynes
[3] A Nanopass Framework for Compiler Education Sarkar, Waddell, Dybvig

Ancillary:
[4] Destination Driven Code Generation
[5] Summer Scheme Workshop; Compiling Scheme

XML feed