Prior art for reifying lifecycle

Are there any prior examples of programming languages that expose the program processing lifecycle as a value or syntax element?

By lifecycle, I mean steps like those below, which many language implementations follow (though not necessarily in this order):

  1. lex: turn source files into tokens
  2. parse: parse tokens into trees
  3. gather: find more sources with external inputs
  4. link: resolve internal & external references
  5. macros: execute meta-programs and macros
  6. verify: check types, contracts, etc.
  7. compile: produce a form ready for loading
  8. run: load into a process that may be exposed to untrusted inputs
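To make "expose the lifecycle as a value" concrete, here is a minimal sketch (the names `Stage` and `may_eval_untrusted` are my own, purely illustrative) of the stages above reified as ordered values that programs and tooling could compare and branch on:

```python
from enum import IntEnum

class Stage(IntEnum):
    # The eight stages above, in order, as comparable values.
    LEX = 1
    PARSE = 2
    GATHER = 3
    LINK = 4
    MACROS = 5
    VERIFY = 6
    COMPILE = 7
    RUN = 8

def may_eval_untrusted(stage: Stage) -> bool:
    # Example policy: untrusted inputs are only admitted once the
    # program is fully static, i.e. at or after the RUN stage.
    return stage >= Stage.RUN

print(may_eval_untrusted(Stage.MACROS))  # False: macros run pre-deployment
print(may_eval_untrusted(Stage.RUN))     # True
```

A language that reified its lifecycle this way could let a module state, in-source, the latest stage at which it performs dynamic operations.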

Does anyone have pointers to designs of languages that allow parts of the program to run at many of these stages *and* explicitly represent the lifecycle stage as a value or syntax element?

I'm aware of reified time in hardware description languages like Verilog and in event loop concurrent languages like JavaScript and E, but that's not what I'm after.

Background

I work in computer security engineering and run into arguments like "we can either ship code in dynamic languages, whose security properties are hard to reason about, or not ship on time."

I'm experimenting with ways to enable features like those below without the exposure to security vulnerabilities, or the difficulty in bringing sound static analysis to bear, that often follows:

  • dynamic loading,
  • embedded DSLs,
  • dynamic code generation & eval,
  • dynamic linking,
  • dynamic type declaration and subtype relationships & partial type declarations,
  • powerful reflective APIs

I was hoping that by allowing a high level of dynamism before untrusted inputs reach the system, I could satisfy most of the use cases that motivate "greater dynamism -> greater developer productivity" while still producing static systems that are less prone to unintended changes in behavior when exposed to crafted inputs.

I was also hoping that, by not having a single macros-run-now stage before runtime, I could support use cases that are difficult with hygienic macros, while still letting a module limit how many assumptions about the language another module can break by reasoning about how early in the lifecycle it imports external modules.

The end goal would be to inform language design committees that maintain widely used languages.

cheers,
mike


Reflective Towers of Interpreters

I wonder if "reflective languages" and the ideas of "reflective towers of interpreters" might be relevant to you: see the languages Brown (Wand & Friedman, 1986), Blond (Danvy & Malmkjær, 1988), and Black (Asai et al., 199x; http://pllab.is.ocha.ac.jp/~asai/Black/).

Actually, googling around, readscheme.org has, as usual, a great collection of resources on the topic: http://library.readscheme.org/page11.html

Reflective Towers of Interpreters

@tonyg, Thanks much for the pointers.

It looks like Blond allows interactions in both directions between tower levels, and the Blond paper touches on turning uses of reflective operators into combinations of basic operators.

So it looks like a tower of interpreters approach might be a good way to define the interactions between modules at different stages of processing, but within a level I need some concept of lifecycle to be able to phrase guarantees like "Module M has reached stage verify so all types that it references and all types that escape it have well-defined, immutable state vectors."
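A sketch of what I mean by phrasing such a guarantee (the `Module` class, its `advance`/`require` methods, and the stage names are all hypothetical, not from any of the cited systems): a module records the latest lifecycle stage it has completed, and clients demand a minimum stage before relying on stage-conditional properties.

```python
class Module:
    # Lifecycle stages in the order they must complete.
    ORDER = ["lex", "parse", "gather", "link", "macros",
             "verify", "compile", "run"]

    def __init__(self, name):
        self.name = name
        self.completed = -1  # index into ORDER; -1 = nothing done yet

    def advance(self, stage):
        # Stages may only advance monotonically, one at a time.
        i = Module.ORDER.index(stage)
        assert i == self.completed + 1, "stages must advance in order"
        self.completed = i

    def require(self, stage):
        # Raise unless this module has reached `stage`; this is where a
        # guarantee like "all referenced types have well-defined,
        # immutable state vectors" would attach.
        if self.completed < Module.ORDER.index(stage):
            raise RuntimeError(f"{self.name} has not reached {stage}")

m = Module("M")
for s in ["lex", "parse", "gather", "link", "macros", "verify"]:
    m.advance(s)
m.require("verify")  # passes: type guarantees may now be assumed
```

The interesting design question is whether `require` can be checked statically rather than at the metalevel, which is where the tower structure might help.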

micros and fexprs

Fwiw. In providing control over the translation process, I'm aware of two approaches essentially dual to each other: fexprs, which I pursued in my dissertation, and micros, which Shriram Krishnamurthi (the outside member of my dissertation committee, as it happens) pursued in his dissertation. The symmetry here goes remarkably deep. From my dissertation:

Each fexpr specifies a computation directly from the (target-language) operands of the fexpr call to a final result, bypassing automatic operand evaluation and thus giving the programmer complete semantic control over computation from the target language.

[...] each micro specifies a translation directly from the (source-language) operands of the micro call to a target expression, bypassing automatic operand translation and thus giving the programmer complete syntactic control over translation from the source language. Thus, micros are to translation what fexprs are to computation. The parameters of the analogy are that micros bypass processing that would happen after a macro call, and are inherently syntactic; while fexprs bypass processing that would happen before a procedure call, and are inherently semantic.

The analogy also extends to internal mechanics of the devices: micros, as treated in [Kr01], rely heavily on a function dispatch that explicitly performs source translations, compensating for the loss of automatic operand translations — just as fexprs (treated here) rely heavily on a function eval that explicitly performs evaluations, compensating for the loss of automatic operand evaluations.
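To illustrate the fexpr side of the duality, here is a toy evaluator sketch (my own construction, not from either dissertation; `ev`, `fexpr`, and `if_` are invented names) in which a fexpr receives its operands *unevaluated* plus the caller's environment, and must invoke `ev` itself when it wants a value — the explicit `eval` compensating for the lost automatic operand evaluation:

```python
def ev(expr, env):
    if isinstance(expr, str):            # variable reference
        return env[expr]
    if not isinstance(expr, list):       # self-evaluating literal
        return expr
    op = ev(expr[0], env)
    if getattr(op, "is_fexpr", False):   # fexpr: operands passed unevaluated
        return op(expr[1:], env)
    args = [ev(e, env) for e in expr[1:]]  # applicative: evaluate operands
    return op(*args)

def fexpr(f):
    # Mark a function as a fexpr so the evaluator skips operand evaluation.
    f.is_fexpr = True
    return f

@fexpr
def if_(operands, env):
    # `if` as a fexpr: evaluates the test, then only the branch it needs.
    test, then, alt = operands
    return ev(then, env) if ev(test, env) else ev(alt, env)

env = {"if": if_, "x": 0, "inc": lambda n: n + 1}
print(ev(["if", "x", ["inc", "x"], 42], env))  # 42: else-branch taken
```

A micro, per the quoted description, would invert this: it would receive unevaluated source operands and return a *target expression*, calling an explicit `dispatch` to translate subterms.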

micros and fexprs

@JohnShutt, Thanks. So these provide ways to do syntax extensions in user code without throwing hygiene out the window?

Is your Kernel page a good place to start on fexprs?

starting fexprs

One might start from there, yes; there are links from there to other places to go. The first link on it (I'm reminded) is to the basic fexpr post on my blog. Fexprs have a really atrocious reputation as badly behaved, which I might describe as half-true and incomplete. Trying to explain them has occupied a long series of posts on my blog, the most recent in 2016. I've got some miscellaneous related links on my surfing page (over thar).

Fexprs are tricky to optimize because they're tricky to prove things about, which one might think would make them tricky from a computer security perspective. I can't speak for micros, about which most of what I know is quoted above.

Staging?


Staging

@Ehud Lamm, as in MetaML?
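For readers unfamiliar with the term: staging, as in MetaML, splits a program into explicit generation-time and run-time levels. A rough sketch using plain Python closures as "code" values (MetaML uses brackets and escapes for this; `power_gen` is an illustrative name):

```python
def power_gen(n):
    # Stage 1: generation-time recursion over the fixed exponent n.
    if n == 0:
        return lambda x: 1
    inner = power_gen(n - 1)
    # Stage 2: the returned closure is the residual, specialized program;
    # the recursion over n is gone before any runtime input arrives.
    return lambda x: x * inner(x)

cube = power_gen(3)   # specialization happens once, "before runtime"
print(cube(2))        # 8
```

This is the flavor of "dynamism early, static code late" the original post is after, with the stage distinction visible in the program itself.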

et al..


Thanks, Ehud.

I'm still learning a lot of terminology.

Looks like “Anything recent happening with multi-stage programming?” is a good place to start following threads on multi-stage programming.