archives

Feather: A Heapless Functional Programming Language

Hi all!

I'm busy designing my third language (or, redesigning the first, depending on how you look at it). I changed Feather to be a heapless functional programming language. Primarily, this means there is no need for a garbage collector. Programs for embedded devices that have real-time constraints stand to benefit the most from this. With any luck, we could get a subset to compile to some GPUs.

Here's the design as it is. Note that I use underscores for tabs whitespace due to forum restrictions -


The Feather Programming Language is a heapless functional programming language for real-time software, especially on embedded devices. It is a compiled language with a Hindley-Milner type-inferencing algorithm and strict function argument evaluation.

Feather’s lack of heap is accomplished by extending the end frame pointer for previous stack frames whose return slot uses the data of the successive frames that the end frame pointer is extended over. This elides the need for any type of garbage collection. The cost is that a bunch of dead ‘popped’ variables also get extended over, but this shouldn’t be a big problem in practice when using the techniques described next. Feather can also return closures, but that means the end frame pointer must be extended over the closed-over values, so they must be used with caution.

Non-consable return values are placed at the beginning of the next stack frame and the previous end frame pointer is extended over it.

Consable return values like lists and maps are treated with more care. When a consable value is to be returned, 8 to 32 slots are allocated at the beginning of the new stack frame, and any conses that take place on the consable return value are placed in one of those slots. (In fact, there may need to be a way to specify what the beginning slot count should be for a given function call.) Further, any conses on those conses will also be placed in one of those slots. If the return slots are finally filled, then n * 4 new slots are allocated past the last stack variable for further use, and so on. An analyzer could be hooked in at run-time to track the average and max allocations needed at each call site to eliminate most needs to do secondary (and so on) allocations for consable return values.

There may also be a special cons that allows temporary conses to be used on consable return value that are _not_ placed in the return slots. If one of these accidentally makes it into the returned cons structure, it will be wasteful due to extending the return frame end pointer over that stack variable - but will not be terminal to the program.

With these technique, I think a minimum amount of space will be wasted on dead varables, but we'll have to see for sure. Tail-call optimizations should be possible with this set up.

There will almost certainly be some restrictions on language's expressiveness due to the heaplessness, but we'll discover and work around those as we go along. For example, I'm not sure if monads will be implementable without a heap, but the syntax is there in case we can.

Some syntax -

/* multi-line comment /* nests */*/
// line comment
/// Documentation comments are done exactly the same as in F#.

// function definition syntax uses powerful mix-fixity and repetition specification
define two as 1 + 1
define double [x] as 2 * x
define [x] times [y] as x * y
define [x] *** [y] precedence 0.5 as x * x * y * y
define if [condition] then [lazy consequent] else [lazy alternate] as condition ? consequent ?? alternate // ? and ?? are the built-in ternary operator
define let ([n] be [x] *and) in [e] with (Name n, Expr e) as…
define sqrt [x] requiring x >= 0 as… // 'requiring' specifies function contract ('ensuring' might mean post-condition).

// function signatures
asFloat :: x -> y |> (Number x, Float y)
let :: ((n, x), e) -> r |> (Name n, Expr e) // function signatures only use the major word

(map)________// partial application. Same as (x, y) ~> x map y
(xs map)_____// same as x ~> xs map x
[map]________// translation to a fully curried function. Same as x ~> y ~> x map y
(f) . (g)____// function composition

() // unit
'c' // char
"string" // string
$(1, 2, 3) // list
$[1, 2, 3] // vector
${1, 2, 3} // tuple
$(("Jim", 1), ("Bob", 2)) // association list
$[["Jim", 1], ["Bob", 2]] // dictionary
${{a, b, c}} // set

double 5
2 times 3
(x + y) * z
x = 5 // equality
x : xs // cons
x | 0xff // all characters in a hex literal must be lower-case (including the 'x' separator)
true & false // and operator (non-short-circuiting version)
true && false // and operator (short-circuiting version)
set x to 5 // x is a variable (mutable)
set x to $(1..5)
set x : xs to $(1, 2, 3)
aVector !! 5 // look up 5th element
lazy 1 + 1 // use lazy evaluation. NOTE: not sure how well this will work in a heapless language…

(x, y) ~> x * x - y
steps print 1 then print 7
if 3 let x be 10 and y be 13 in x + y
$(1, 2, 3) map (+ 3) filter (isOdd) // because of white space, (+ 3) != (+3)
for x from someList and y from $(1..5) and where y isOdd then lift x * y

// recommended indentation
let
____x be 10 and
____y be 88 and
____z be 12 in
____x + y * z

// sequential expression
steps
____print 1 then
____print 7 then
____print "Hi!"

// choice expression
choose
____when x = 5 then print 22
____when x >= 5 then print 33
____when x ____else print 55

// simple match
match m
____with x : xs then x
____with ${x, x, _} then x
____else m

// multi-match
match m and n
____with x : xs and x then m
____else n

// match chain
match m
____with x : xs where x = 5 then xs
____else $()

// restricted match (makes inferred function type more specific when using a match)
match m as Int
____with 0 then 10
____with 1 then 20
____else 100

// monad comprehension
for
____x from someList and
____y from $(1..5) and
____where y isOdd then
____lift x * y

// functional types
data...________// like Haskell data
synonym..._____// like Haskell type
derivation...__// like Haskell newtype
protocol...____// like Haskell class
instance…______// like Haskell instance

----

Error handling - Uses a system similar to scheme's Condition system.

----

Literal numbers are Int by default -
5(u)b________// (U)Byte - 8-bit
5(u)s________// (U)Short - 16-bit
5(u)_________// (U)Int - 32-bit
5(u)g________// (U)Long - 64-bit
5.0__________// Float - 32-bit
5d___________// Double - 64-bit
5(u)n________// (U)NInt – Native-size integer – varies by hardware platform.

----

Function type arrows –
-> - Describes a function of inferred, unspecified, or dependent purity
=> - Describes a pure function
:> - Describes an impure function

----

Uses of semi-colon -
fn (1, 2, 3; 'a', 'b', 'c') // ; separates repeated parameters
fn2 (1, ; 'a') // ; omits optional parameters
class Thing expose attribute Age as var UInt; print 5 // ; manually ends expr scope without a newline. Note that classes aren’t in the language anymore.

----

Miscellaneous rules -
import… // very similar to Haskell's import, but much simpler and less error prone
Module.fn // qualifies fn with module name. A.b != a . b due to whitespace (and apparently capitalization). In the expression Module.(+), the parens are required for symbolic operator qualification.
shadowed fn // calls the function that the current function is shadowing… might be a special form
rest… // the rest parameters with name rest
rule: indentation defines scope
rule: symbolic operators like + and * have specifiable precedence and associativity, textual ones do not
rule: symbolic operators take precedence over all textual operators
rule: symbolic operators can be sequenced and so can textual ones, but they cannot be sequenced together in the same function specification.
rule: 'if' is a major word since it's the beginning of a sequence and 'then' and 'else' are minor words since they're in the tail of a sequence.
rule: symbolic chars cannot be mixed with alphanumeric chars in the same word, and neither can be mixed with delimiters such as (, ), {, }, [, and ]. Sadly this means we can’t have meaningful (|banana clips|) or [|oxford brackets|].
rule: words in an ongoing sequence take precedence over a single word or words beginning a new sequence.
rule: the point where a function is injected is its application site, even if it is partially applied with no arguments, like map in the expression fn (map).
rule: purity means more than just no immediate side-effects; it means referentially transparent.
rule: expressions should generally be readable from left to right and top to bottom.
rule: there are 3 types of function purity – pure, impure, and dependent. Only higher order functions can be dependent. A dependent function resolves to pure or impure depending on how it is invoked. If one or more of its parameter functions is impure, it becomes impure. Otherwise, its purity depends on whether it calls other impure functions.
rule: a function with impure contents can be forced to be have a pure type with the pure function modifier. This allows most pure functions to be inferred, and the remaining be manually declared as such.
rule: all module, type, protocol, and type / data constructor names start with an uppercase letter. This makes pattern matching much easier to use (this is what Haskell does).
rule: the only time you need to qualify a name with the module is when it collides with another imported name.
rule: operator precedence and associativity is the same as in C, but with floating point precedence values (is that a good idea if the embedded systems might not have floating-point acceleration?)
rule: like python and its recommended 'pythonic' style philosophy, there is a similar recommended philosophy in Feather. In Feather, there is always a single best way to do any one thing, and it should be obvious. This won't always be achievable, but is a good language goal.
rule: textual operators should typically be written in an active voice.

So anyways, I'll keep designing this until I get a chance to start implementing it. I've already implemented a pure functional programming language with F# (see AmlProject at SourceForge), so I've got enough confidence to start on this one. I'm thinking I'll use LLVM to build the compiler.

Hopefully this language will finally allow us to program stuff like games for embedded devices and console in a functional style.

On the Naturalness of Software

A paper in ICSE 2012 by Abram Hindle, Earl Barr, Zhendong Su, Mark Gabel, and Prem Devanbu. Abstract:

Natural languages like English are rich, complex, and powerful. The highly creative and graceful use of languages like English and Tamil, by masters like Shakespeare and Avvaiyar, can certainly delight and inspire. But in practice, given cognitive constraints and the exigencies of daily life, most human utterances are far simpler and much more repetitive and predictable. In fact, these utterances can be very usefully modeled using modern statistical methods. This fact has led to the phenomenal success of statistical approaches to speech recognition, natural language translation, question-answering, and text mining and comprehension.

We begin with the conjecture that most software is also natural, in the sense that it is created by humans at work, with all the attendant constraints and limitations—and thus, like natural language, it is also likely to be repetitive and predictable. We then proceed to ask whether a) code can be usefully modeled by statistical language models and b) such models can be leveraged to support software engineers. Using the widely adopted n-gram model, we provide empirical evidence supportive of a positive answer to both these questions. We show that code is also very repetitive, and in fact even more so than natural languages. As an example use of the model, we have developed a simple code completion engine for Java that, despite its simplicity, already improves Eclipse’s built-in completion capability. We conclude the paper by laying out a vision for future research in this area.

More a visionary paper, but seems to be the best work I've seen so far on how to apply ML to programming.