archives

Transpiling a dynamically typed language to a statically typed language

Hi, I have been pondering what I want to do for my master
thesis (2+ years away) and I have an idea. I am writing to ask if
the question has a known answer or is even theoretically possible
to conduct.

My idea is to write a source-to-source compiler from a Lisp_2 to
Rust, but I do not know if it is possible to transpile from a
dynamic language to a statically typed language.

Below is a rough draft for the introductory chapter. Beware that
my writing style implies that I already know that producing such
a transpiler is possible, but I do not.

Kind regards,
Filip Allberg

A Tale of Two Languages

Imbuing Lisp with the thread-safety guarantees of Rust by using a source-to-source compiler.

Introduction

This short introductory chapter starts by presenting the question we are trying to answer, followed by a summary of our suggested solution that provides a definitive answer to the our question. We end this chapter ends with an overview of the subsequent chapters.

The question

By compiling a language X into a target language Y through the use of a source-to-source compiler (also known as a transpiler) the executable program (created by an additional compilation step) inherits certain qualities from the language Y, particularily its execution model.

Transpiling X to a target language Y may put restraints on the definition of X and possibly have a negative impact on the runtime environment of the final executable certain desireable traits may be inherited from Y, for an example cross-platform portability.

A commonly selected target language for a transpiler is the programming language "C". Since C is available on practically any machine, the C-code that is generated from a program written in X is likely to be portable as well.

Moreover, any optimizations that the C-compiler can achieve are automatically and implicitly available to the source language X. C-compilers excel at carrying out a great many optimizations with respect to allocating registers, laying out code, and choosing modes of address. As designers of the source language we are liberated of such issues.

In the case of C there are numerous pain-points such as pointers running amok and its lack of built-in garbage collection. If the source-language is a garbage-collected language, that is type-safe, such problems have to be addressed in the source-to-source compiler.

Furthermore there are certain limits imposed on our source language by the limitations of C has, such as a maximum of 32 arguments in function calls or less than 16 levels of lexical blocks.

Inherently, each source language presents its own set of advantages and negative consequences with respect to the source language.

The Rust programming language, which is a systems programming language that is a contender in the same domain as C and C++, has the desireable attribute of guaranteeing thread-safety and threads without dataraces.

The question then is, can a specific source language compiled into Rust be unequivocally determined to be thread-safe through the merits of the source-to-source compiler alone?

The solution and answer

In this text we present a source-to-source compiler from a Lisp2 dialect to Rust whereby we can, through the use of the Rust compiler rustc, compile the generated code. Using the generated programs, we can conclude, with the addition of formal theoretical deductions, that valid programs written in the aforementioned Lisp dialect are thread-safe and provides threads without data-races.

Overview

Since there is no additional body of work this subsection cannot be fully written, but a rough layout of the remainder of the text would probably look like:

Chapter 1. An overview of Lisp and a formal description of our dialect as well as an overview of our concurrency model (not sure if the latter is necessary)

Chapter 2. A precursory introduction to Rust with references to papers demonstrating the validity of the claims that Rust provides thread safety guarantees and data race guarantees as this is a corner stone to our work.

Chapter 3. Examples from concrete programs written in the Lisp dialect that will serve as benchmarks when evaluating the code generated by our compiler.

Chapter 4. An overview of the implementation of the compiler that also discuss alternative implementation choices and why they were not chosen

Chapter 5-(maybe several chapters): Explanations describing the parts of the different compilation phases. It is probable that we will encounter some automatas here.

Chapter 6: Code generation optimizations and translation strategies to portable Rust code.

Chapter 7: Evaluation of benchmarks

Chapter 8: Discussions of possible applications, such as tiering source-to-source compilers in a layered fashion to successively imbue desireable traits from each language in the compilation sequence to the top-most source language.

Appendices: Complete source code for the compiler