Application-specific foreign-interface generation

Application-specific foreign-interface generation, John Reppy and Chunyan Song, October 2006.

A foreign interface (FI) mechanism to support interoperability with libraries written in other languages (especially C) is an important feature in most high-level language implementations. Such FI mechanisms provide a Foreign Function Interface (FFI) for the high-level language to call C functions and marshaling and unmarshaling mechanisms to support conversion between the high-level and C data representations. Often, systems provide tools to automate the generation of FIs, but these tools typically lock the user into a specific model of interoperability. It is our belief that the policy used to craft the mapping between the high-level language and C should be distinct from the underlying mechanism used to implement the mapping.

In this paper, we describe a FI generation tool, called FIG (for Foreign Interface Generator) that embodies a new approach to the problem of generating foreign interfaces for high-level languages. FIG takes as input raw C header files plus a declarative script that specifies the generation of the foreign interface from the header file. The script sets the policy for the translation, which allows the user to tailor the resulting FI to his or her application. We call this approach application-specific foreign-interface generation. The scripting language uses rewriting strategies as its execution model. The other major feature of the scripting language is a novel notion of composable typemaps that describe the mapping between high-level and low-level types.

FFIs are a perennial engineering problem, and it's very nice to see progress being made on automating what's automatable about building interfaces. Their interface specification language is built from two little DSLs. The first is a language for specifying how to map low-level types to high-level types, and the second is a rewriting-based language for translating API functions, which makes use of the type-mapping programs defined in the first. The whole thing is quite pretty, and seems to read very well.
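To give a feel for what a rewriting-strategy execution model means, here is a rough Python sketch. It is not FIG's syntax or semantics. The tuple encoding of declarations, the `mylib_` prefix, and every rule name are made up for illustration. The only idea borrowed from the abstract is that rules either rewrite a term or fail, and that small combinators glue rules into a policy.

```python
from typing import Callable, Optional, Tuple

# A toy declaration: ("fun", name, argument types, return type).
Decl = Tuple[str, str, Tuple[str, ...], str]
# A strategy rewrites a declaration, or returns None to signal failure.
Strategy = Callable[[Decl], Optional[Decl]]


def seq(s1: Strategy, s2: Strategy) -> Strategy:
    """Apply s1, then s2 to its result; fail if either one fails."""
    def run(d: Decl) -> Optional[Decl]:
        r = s1(d)
        return None if r is None else s2(r)
    return run


def choice(s1: Strategy, s2: Strategy) -> Strategy:
    """Try s1; if it fails, fall back to s2 on the original input."""
    def run(d: Decl) -> Optional[Decl]:
        r = s1(d)
        return r if r is not None else s2(d)
    return run


def attempt(s: Strategy) -> Strategy:
    """Apply s if it succeeds, otherwise leave the declaration alone."""
    return choice(s, lambda d: d)


def strip_prefix(prefix: str) -> Strategy:
    """Hypothetical rule: drop a library prefix from function names."""
    def run(d: Decl) -> Optional[Decl]:
        kind, name, args, ret = d
        if kind == "fun" and name.startswith(prefix):
            return (kind, name[len(prefix):], args, ret)
        return None
    return run


def char_ptr_to_string(d: Decl) -> Optional[Decl]:
    """Hypothetical rule: present char* arguments as strings."""
    kind, name, args, ret = d
    if "char*" not in args:
        return None
    new_args = tuple("string" if a == "char*" else a for a in args)
    return (kind, name, new_args, ret)


# The "policy" is just a composition of rules chosen by the user.
policy = seq(attempt(strip_prefix("mylib_")), attempt(char_ptr_to_string))

decl: Decl = ("fun", "mylib_draw_text", ("char*", "int"), "void")
result = policy(decl)
print(result)
```

A different application could compose the same rules differently (or leave one out), which is roughly the sense in which the policy is kept apart from the mechanism.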

An interesting gimme for you stack-language fans: the DSL that Reppy and Song use to specify type mappings from low-level to high-level types is a combinator-based language that reads a bit like Forth or Postscript.
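For readers who want something concrete, here is a minimal sketch of what a composable typemap could look like, written in Python rather than in the paper's notation. The `TypeMap` class, the `then` and `list_of` combinators, and the example maps are all assumptions of mine, not anything taken from FIG. The point is only that a typemap is a pair of conversions, and that pairs compose.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class TypeMap:
    """A pair of conversions between a low-level and a high-level value."""
    to_high: Callable[[Any], Any]  # low-level -> high-level (unmarshal)
    to_low: Callable[[Any], Any]   # high-level -> low-level (marshal)

    def then(self, other: "TypeMap") -> "TypeMap":
        """Compose: self maps low -> mid, other maps mid -> high."""
        return TypeMap(
            to_high=lambda x: other.to_high(self.to_high(x)),
            to_low=lambda y: self.to_low(other.to_low(y)),
        )


def list_of(elem: TypeMap) -> TypeMap:
    """Lift an element typemap to lists of elements."""
    return TypeMap(
        to_high=lambda xs: [elem.to_high(x) for x in xs],
        to_low=lambda ys: [elem.to_low(y) for y in ys],
    )


# Hypothetical primitive maps.
int_to_bool = TypeMap(lambda i: i != 0, lambda b: 1 if b else 0)
negate = TypeMap(lambda b: not b, lambda b: not b)

# C convention "0 means success" exposed as "True means success".
status_ok = int_to_bool.then(negate)

print(status_ok.to_high(0))                 # low-level 0 -> high-level True
print(status_ok.to_low(True))               # and back again
print(list_of(status_ok).to_high([0, 1, 0]))
```

Read left to right, `int_to_bool.then(negate)` has some of the concatenative flavor mentioned above: each combinator consumes the result of the previous one.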

FIG & s48

A very interesting paper.

neelk wrote: An interesting gimme for you stack-language fans: the DSL that Reppy and Song use to specify type mappings from low-level to high-level types is a combinator-based language that reads a bit like Forth or Postscript.

The FIG code in the article also looks a little like the prescheme-implemented bytecode operators that you get if you inspect scheme48 code, which makes me think. I'm guessing that it would be possible to port these ideas across to scheme48 without too much difficulty. If I grasp the paper correctly (I've just skimmed it):

  • Invoking FIG could be specified in the configuration files as a package specified with new configuration forms, which would point to the MBI files in question, and which should make them available as one or more structures;
  • The conversion combinators could be implemented in prescheme as new bytecodes;
  • The library would then be visible to other modules through the structures, which would simply invoke the code specified by the conversion combinators.

What's not clear to me is how this helps. Would such scheme code be able to share MBI files created for SML code? How does FIG help with GC boundary issues, especially where the C code we are linking to implements some sort of GC?

If I understand

If I understand the paper correctly, the MBI files are Moby interface files, and contain type signatures for the functions the module exports, as well as the implementations of inline-able functions (such as conversions). There are FIG backends to support Moby and SML, and for Scheme you'd need to write another one.

GC is an interesting question. As I understand it, the NLFFI/Charon approach doesn't do *any* garbage collection of C objects -- basically it lets you manipulate C objects from the high-level language, and if you need to free them you need to call the high-level mapping of free() yourself. If your language has support for finalizers and weak references, then you can use that to implement the custom interaction of the C code and your own language, and call those functions in the translations your FIG code generates.
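As a rough illustration of the finalizer idea, here is a Python sketch. It has nothing to do with FIG's generated code. `FakeCHeap` is a stand-in I made up so the example runs without a real C library, and `CObject` is a hypothetical wrapper. The only real API used is `weakref.finalize` from the standard library.

```python
import gc
import weakref


class FakeCHeap:
    """Stand-in for C's malloc/free so the sketch needs no real C library."""

    def __init__(self) -> None:
        self.live = set()   # addresses currently allocated
        self._next = 1

    def malloc(self) -> int:
        addr = self._next
        self._next += 1
        self.live.add(addr)
        return addr

    def free(self, addr: int) -> None:
        self.live.discard(addr)


class CObject:
    """High-level handle that frees its C-side memory when collected."""

    def __init__(self, heap: FakeCHeap) -> None:
        self.addr = heap.malloc()
        # The callback must not reference self, or self could never die.
        self._finalizer = weakref.finalize(self, heap.free, self.addr)

    def close(self) -> None:
        """Free eagerly; calling the finalizer more than once is harmless."""
        self._finalizer()


heap = FakeCHeap()
obj = CObject(heap)
live_before = set(heap.live)

del obj          # drop the last reference to the handle
gc.collect()     # make sure the collector has had a chance to run
live_after = set(heap.live)

print(live_before, live_after)
```

Without something like this layered on top, the situation is the one described above: the C object lives until somebody calls the mapped `free()` by hand.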

What does the backend do?

Thanks. You wrote: There are FIG backends to support Moby and SML, and for Scheme you'd need to write another one.

If the backend doesn't support GC, it doesn't sound like there's all that much for it to do. Maybe it sanity checks calls through the interface?