LtU Forum

Some notes on Rust, the language.

Nobody seems to be saying much about Rust, or if they are, the LtU search can't find it. So I'm starting a Rust topic.

After some initial use of Mozilla's "Rust" language, a few comments. I'm assuming here some general familiarity with the language.

The ownership system, which is the big innovation, seems usable. The options are single ownership with compile-time checking, or multiple ownership with reference counts. The latter is available in both plain and concurrency-locked form, and you're not allowed to share unlocked data across threads. This part of the language seems to work reasonably well. Some data structures, such as trees with backlinks, do not map well to this model.
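
As a rough illustration of why backlinks are awkward, here is a minimal sketch (the type and field names are my own, not from any particular program): a tree node shared through reference counting, where the backlink to the parent has to be a weak reference so the parent/child cycle doesn't keep both nodes alive forever.

use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    children: Vec<Rc<RefCell<Node>>>,
    parent: Option<Weak<RefCell<Node>>>, // backlink must be weak, not Rc
}

fn main() {
    let root = Rc::new(RefCell::new(Node {
        value: 1,
        children: Vec::new(),
        parent: None,
    }));
    let child = Rc::new(RefCell::new(Node {
        value: 2,
        children: Vec::new(),
        parent: Some(Rc::downgrade(&root)),
    }));
    root.borrow_mut().children.push(child.clone());

    // Follow the backlink from child to parent.
    let parent_value = child
        .borrow()
        .parent
        .as_ref()
        .and_then(|w| w.upgrade())
        .map(|p| p.borrow().value);
    println!("{:?}", parent_value); // Some(1)
}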

There's a temptation to resort to the "unsafe" feature to bypass the single-use compile-time checking system. I've seen this in some programs ported from other languages. In particular, allocating a new object and returning a reference to it from a function is common in C++ but difficult in Rust, because the function doing the allocation doesn't know the expected lifetime of what it returns. It takes some advance planning to live within the single-use paradigm.
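
A minimal sketch of the difference (the names here are invented for illustration): returning an owned Box hands the allocation to the caller, while returning a borrowed reference to a local cannot compile, because the compiler has no lifetime to give it.

struct Node {
    value: i32,
}

// Works: the caller now owns the Node and decides when it is dropped.
fn make_node(value: i32) -> Box<Node> {
    Box::new(Node { value: value })
}

// Would not compile: the reference would outlive the local `n`.
// fn make_node_ref(value: i32) -> &Node {
//     let n = Node { value: value };
//     &n
// }

fn main() {
    let node = make_node(42);
    println!("{}", node.value);
}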

Rust's type system is more troublesome. Rust has most of the bells and whistles of C++, plus some new ones. It's clever, well thought out, sound, and bulky. Declarations are comparable in wordiness to C++. Rust has very powerful compile-time programming; there's a regular expression compiler that runs at compile time. I'm concerned that Rust is starting out at the cruft level it took C++ 20 years to achieve. I shudder to think of what things will be like once the Boost crowd discovers Rust.

The lack of exception handling in Rust forces program design into a form where many functions return "Result" or "Option", which are generic enumeration/variant record types. These must be instantiated with the actual return type. As a result, a rather high percentage of functions in Rust seem to involve generics. There are some rather tortured functional programming forms used to handle errors, such as ".and_then(lambda)". Doing N things in succession, each of which can generate an error, is either verbose (match statement) or obscure ("and_then()"). You get to pick. Or you can just use ".unwrap()", which extracts the value from a Some or Ok form and makes a failure fatal. It's tempting to over-use that. There's a macro called "try!(e)", which, if e evaluates to an error value, returns from the enclosing function via a return you can't see in the source code. Such hidden returns are troubling.
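
To make the trade-off concrete, here is a small sketch of the same two-step computation written three ways (the function and variable names are mine; the "?" operator in the last version is the later spelling of the "try!" macro):

use std::num::ParseIntError;

// Verbose: an explicit match on each fallible step.
fn sum_verbose(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x = match a.parse::<i64>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    let y = match b.parse::<i64>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(x + y)
}

// Combinator style: and_then chains each fallible step.
fn sum_and_then(a: &str, b: &str) -> Result<i64, ParseIntError> {
    a.parse::<i64>()
        .and_then(|x| b.parse::<i64>().map(|y| x + y))
}

// The hidden early return: `?` returns the error from the enclosing
// function if parsing fails, with no visible `return` in the source.
fn sum_question(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x: i64 = a.parse()?;
    let y: i64 = b.parse()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", sum_verbose("1", "2"));   // Ok(3)
    println!("{:?}", sum_and_then("1", "x"));  // Err(...)
    println!("{:?}", sum_question("1", "2"));  // Ok(3)
}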

All lambdas are closures (this may change), and closures are not plain functions. They can only be passed to functions which accept suitable generic parameters. This is because the closure lifetime has to be decided at compile time. Rust has to do a lot of things in somewhat painful ways because the underlying memory model is quite simple. This is one of those things which will confuse programmers coming from garbage-collected languages. Rust will catch their errors, and the compiler diagnostics are quite good. Rust may exceed the pain threshold of some programmers, though.
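
A minimal sketch of what "suitable generic parameters" means in practice (names invented for this example): the receiving function has to be generic over an Fn trait bound, because each closure has its own anonymous type and there is no one plain function type that fits them all.

// The receiving function is generic over any closure matching the bound.
fn apply_twice<F>(f: F, x: i32) -> i32
where
    F: Fn(i32) -> i32,
{
    f(f(x))
}

fn main() {
    let offset = 10;
    // This closure captures `offset` from its environment.
    let add_offset = |x| x + offset;
    println!("{}", apply_twice(add_offset, 1)); // 21
}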

Despite the claims in the Rust pre-alpha announcement of language definition stability, the language changes enough every week or so to break existing programs. This is a classic Mozilla problem; that organization has a track record of deprecating old features in Firefox before the replacement new feature works properly.

Despite all this, Rust is going to be a very important language, because it solves the three big problems of C/C++ that cause crashes and buffer overflows. The three big problems in C/C++ memory management are "How big is it?", "Who owns and deletes it?", and "Who locks it?". C/C++ deals with none of those problems effectively. Rust deals with all of them, without introducing garbage collection or extensive run-time processing. This is a significant advance.

I just hope the Rust crowd doesn't screw up. We need a language like this.

Computing by deltas?

Say we have a program: F -> G. Under what situations can we generate a program which operates on *deltas* of F and G?

That is, say that dF is an efficiently stored delta value between an old and new value of F, and dG is a delta between values of G. Are there many situations where we can derive the program dF -> dG given F -> G ?

It seems like this is probably a topic with existing research, but I'm having trouble figuring out what to search for. This would be interesting in the context of a framework like React.js, where there is user code that modifies the user-interface state based on external events, then more user code which translates that state into a virtual DOM value, then finally the real DOM is modified using the virtual DOM. If we were able to derive a delta-based version of the user's code, then we could directly translate external events into real DOM manipulation, without generating an entire virtual DOM along the way.
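
To make the question concrete, here is a toy sketch (in Rust, with names I've made up): if F is "sum a list", then a delta program dF can map a change to one element directly to a change in the output, without recomputing over the whole input. The interesting question is when such a dF can be derived automatically from F.

// A toy delta: one element changed from `old` to `new`.
enum Delta {
    Update { old: i32, new: i32 },
}

// F : Vec<i32> -> i64, the original program (here, a sum).
fn f(xs: &[i32]) -> i64 {
    xs.iter().map(|&x| x as i64).sum()
}

// dF : maps an input delta to an output delta (here, an additive change).
fn df(d: &Delta) -> i64 {
    match d {
        Delta::Update { old, new } => (*new - *old) as i64,
    }
}

fn main() {
    let xs = vec![1, 2, 3, 4];
    let total = f(&xs);
    // Change xs[2] from 3 to 10 without re-running f over everything.
    let d = Delta::Update { old: 3, new: 10 };
    let new_total = total + df(&d);
    assert_eq!(new_total, f(&[1, 2, 10, 4]));
    println!("{} -> {}", total, new_total);
}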

Negation in Logic Languages

This is a follow-up from my earlier post (I will try and put a link here) about negation in Prolog. The question was whether negation is necessary, and I now have some further thoughts (none of this is new, but new to me). I can now see that my earlier approach to implementing 'member of' relies on evaluation order, and it no longer works now that I have moved to iterative deepening. So I need a way of introducing negation that is logically sound. Negation as failure leads to unsoundness with variables. Negation as refutation seems to require three-valued logic, and negation as inconsistency requires an additional set of 'negative' goals, which complicates the use of negation. The most promising approach I have found is to convert negation into inequalities, and propagate 'not equal' as a constraint in the same way as CLP. This eliminates negation in goals in favour of equality and disequality. I am interested if anyone has any experience of treating negation in this way, or any further thoughts about negation in logic programming or logical frameworks.
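
As a very rough sketch of the disequality idea (written in Rust rather than a logic language, with invented names): an unbound variable carries the set of values it must not equal, and the constraint is checked when the variable is finally bound, roughly in the spirit of CLP's dif/2.

enum Var {
    Free { not_equal: Vec<i32> },
    Bound(i32),
}

impl Var {
    fn new() -> Var {
        Var::Free { not_equal: Vec::new() }
    }

    // Record x =/= v, or fail immediately if x is already bound to v.
    fn constrain_neq(&mut self, v: i32) -> bool {
        match self {
            Var::Free { not_equal } => {
                not_equal.push(v);
                true
            }
            Var::Bound(b) => *b != v,
        }
    }

    // Bind x = v, failing if it violates a recorded disequality.
    fn bind(&mut self, v: i32) -> bool {
        match self {
            Var::Free { not_equal } => {
                if not_equal.contains(&v) {
                    return false;
                }
            }
            Var::Bound(b) => return *b == v,
        }
        *self = Var::Bound(v);
        true
    }
}

fn main() {
    let mut x = Var::new();
    assert!(x.constrain_neq(3)); // x =/= 3
    assert!(!x.bind(3));         // binding x = 3 now fails
    assert!(x.bind(4));          // x = 4 succeeds
}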

Iterative Functional Reactive Programming with the Nu Game Engine - An Informal Experience Report

This is an experience report on the design and usage of the Nu Game Engine, a purely-functional 2D game engine written in F# -

Iterative Functional Reactive Programming with the Nu Game Engine

Synopsis

The Nu Game Engine certainly qualifies as 'functional reactive' in that -
1) it is reactive in that it uses user-defined events to derive successive simulation states.
2) it is functional in that it uses pure functions everywhere, even in the event system.

However, I cannot describe it as the typical first-order or higher-order FRP system since it uses neither continuous nor discrete functions explicitly parameterized with time. Instead it uses a classically-iterative approach to advancing game states that is akin to the imperative tick-based style, but implemented with pure functions.

Thus, I gave it the name 'Iterative FRP'. It's a purely-functional half-step between classic imperative game programming and first- or higher-order FRP systems. This document discusses the major pluses and minuses of using this 'iterative' FRP style.
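
A hedged sketch of what that iterative style looks like (written in Rust rather than the engine's F#, with types I've invented for illustration): the world is advanced one tick at a time by a pure function from the old state and this tick's events to a new state.

#[derive(Clone, Debug)]
struct World {
    player_x: f32,
    score: u32,
}

enum Event {
    MoveRight(f32),
    Scored(u32),
}

// Pure: the old world is never mutated; a new world is returned each tick.
fn tick(world: &World, events: &[Event]) -> World {
    events.iter().fold(world.clone(), |w, e| match e {
        Event::MoveRight(dx) => World { player_x: w.player_x + dx, ..w },
        Event::Scored(points) => World { score: w.score + points, ..w },
    })
}

fn main() {
    let w0 = World { player_x: 0.0, score: 0 };
    let w1 = tick(&w0, &[Event::MoveRight(1.5), Event::Scored(10)]);
    println!("{:?} -> {:?}", w0, w1);
}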

An aside about the document -

I released an earlier version of this PDF along with an earlier version of the engine, but had to make significant revisions due to a fundamental flaw in the engine's previous design, and thus removed it.

Hopefully this newer version will inform us about the properties of this 'iterative' style of FRP in contrast to the known styles.

return-type polymorphism of monads done right in a dynamic language

While monads can be encoded in dynamically-typed languages, the usual encodings are uglier than monads in type-class-based languages because type inference is not there to help determine which monad instance we're talking about. Some use cases of type inference can be replaced by dynamic type dispatch, but not the so-called "return-type polymorphism" where it is the type expected by the context that matters, not the type carried by values. Monads are just one instance where return-type polymorphism matters (in the form of "return : forall a, a -> m a"), but this is a general problem with other structures (typically structures with a default element, such as monoids/groups).

Intuitively it seems that this problem would be solvable by keeping, in the dynamic type representation, a "not known yet" marker, and using the free monad operations in this case. But there is a gap between intuition and usable code, and Tony Garnock-Jones just wrote a very nice blog post exposing this technique in Racket:

Monads in Dynamically-Typed Languages

(run-io (do (mdisplay "Enter a number: ")
            n <- mread
            all-n <- (return (for/list [(i n)] i))
            evens <- (return (do i <- all-n
                                 #:guard (even? i)
                                 (return i)))
            (return evens)))

How can middle school algebra help with domain specific languages?

As a programmer with no training in type theory or abstract math (beyond what I've learned on LtU), I'm curious to know whether the laws of basic, middle-school algebra can help us design or better understand domain-specific languages.

Recently I saw a presentation on how algebraic data types don't just contain the name 'algebra,' but actually allow many of the same operations as on simple numbers. This will be obvious to people on this forum, but such ideas are very surprising to programmers at large. The link describes not just products and sums, but also x-to-the-power-of-y style operations. The YouTube presentation in the link above then talks about how one can even take derivatives of type expressions, which result in very useful objects, such as zippers.
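
A hedged sketch of the correspondence in Rust (my own example, not from the presentation): enums are sums, tuples and structs are products, and function types play the role of exponentials.

// Sum: Either<A, B> has |A| + |B| inhabitants.
enum Either<A, B> {
    Left(A),
    Right(B),
}

// Product: (A, B) has |A| * |B| inhabitants.
type Pair<A, B> = (A, B);

// Exponential: a function A -> B has |B| ^ |A| inhabitants.
type Exp<A, B> = fn(A) -> B;

fn main() {
    // bool has 2 values, so Either<bool, bool> has 2 + 2 = 4 values,
    // (bool, bool) has 2 * 2 = 4, and fn(bool) -> bool has 2 ^ 2 = 4.
    let s: Either<bool, bool> = Either::Left(true);
    let p: Pair<bool, bool> = (true, false);
    let e: Exp<bool, bool> = |b| !b;
    match s {
        Either::Left(x) | Either::Right(x) => println!("{} {:?} {}", x, p, e(x)),
    }
}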

There have been interesting articles [pdf] on derivatives of regular expressions. Again, Kleene algebra seems to define basic operations, +, *, etc., although the derivative part is beyond me.

Modern database systems are based on relational algebra, and this algebra is beautifully simple. Although I haven't seen this algebra described specifically using basic arithmetic operators, it does contain Cartesian product, sums and even division.

So many of these very practical tools that programmers use every day are based on what looks similar to school algebra (they even contain the word algebra in their name). Where can I learn more about how to design DSLs using algebraic operators? My assumption is that by starting with a domain, and being forced to come up with interpretations of +, -, *, /, etc. (interpretations which follow specific rules), I'll end up with a pretty minimal, yet expressive, language.

Finally, I actually started thinking about this after implementing Peyton Jones & Eber's "Composing Contracts." That language describes a couple of basic types, Contracts and Observables, and then describes operations on these types: And, Or, Give, Scale, which map pretty easily to +, *, negate, etc. Once again, a beautifully simple language can describe a complex set of financial instruments. What's more, the constructs of this language (like so many others) seem very closely related to school algebra! What might it mean to take the derivative of a contract? What might it mean to take its square or square root?
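
A hedged sketch of that algebraic flavor (in Rust rather than the paper's Haskell, with simplified constructors and no Observables): contracts are built from a handful of combinators that compose much like +, *, and negation.

#[allow(dead_code)]
#[derive(Clone, Debug)]
enum Contract {
    One(String),                        // receive one unit of a currency
    And(Box<Contract>, Box<Contract>),  // acquire both contracts: roughly "+"
    Or(Box<Contract>, Box<Contract>),   // choose the better of the two
    Give(Box<Contract>),                // swap rights and obligations: negation
    Scale(f64, Box<Contract>),          // multiply by an amount: roughly "*"
}

fn one(currency: &str) -> Contract {
    Contract::One(currency.to_string())
}

fn and(a: Contract, b: Contract) -> Contract {
    Contract::And(Box::new(a), Box::new(b))
}

fn give(c: Contract) -> Contract {
    Contract::Give(Box::new(c))
}

fn scale(k: f64, c: Contract) -> Contract {
    Contract::Scale(k, Box::new(c))
}

fn main() {
    // "Receive 100 USD and pay 75 EUR", composed from the combinators.
    let c = and(scale(100.0, one("USD")), give(scale(75.0, one("EUR"))));
    println!("{:?}", c);
}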

symbols and loosely coupled concurrent apps part II

This discussion is a continuation of managing closed worlds of symbols via alpha-renaming in loosely coupled concurrent apps, where a number of different topics were touched on, most recently a Lisp sub-thread at the end, in addition to a bunch of stuff I was posting about lightweight user-space process architecture involving threads and fibers. When that discussion gets less useful due to sheer size, I'll continue here when I have anything novel to say, which probably won't be that much. You're welcome to continue here anything started there, too.

(A number of things I said in the other discussion might be expected to bother folks who want to see fewer development systems around, with less competition, fewer languages, and narrower choices due to existing "winners" who will guide us to a managed promised-land where we can be tenants for a modest fee. Is that enough hyperbole? I can't be bothered to get worked up about it, but I see many things as subject to change, especially along lines of choice in portable code for app organization not dictated by trending vendors. I expect to mostly ignore politics and advocacy.)

The first thing I want to continue is discussion of performance among options for implementing fibers (single-threaded user-space control flows that multiplex one OS thread) because I like heap-based spaghetti stacks, while others recommend options I think are poor choices, but to each his own.

How can languages help us in terms of achieving correct program design?

We talk a lot about how languages can help us with correctness of program execution, but why don't we talk about how languages help us with correctness of program design?

With all the rework that goes into making a non-trivial system well-designed, I'm bummed this subject does not receive much treatment - especially in terms that non-PhD's like me can understand.

Any thoughts?

Function Readability & Understandability

Two versions of the same function in JavaScript, because it's a language that supports both the functional and the imperative style. The function accepts any number of arrays as arguments, and outputs the permutations of the values, so (['a', 'b'],['1', '2']) becomes [['a', '1'], ['a', '2'], ['b', '1'], ['b', '2']]:

function permutations1(x) {
    return x.reduce(function(a0, v0) {
        return a0.reduce(function(a1, v1) {
            return a1.concat(v0.map(function(v2) {
                return v1.concat([v2]);
            }));
        }, []);
    }, [[]]);
}

function permutations2(x) {
    var a = [[]];
    for (var i = 0; i < x.length; ++i) {
        var b = [];
        for (var j = 0; j < a.length; ++j) {
            for (var k = 0; k < x[i].length; ++k) {
                b.push(a[j].concat([x[i][k]]));
            }
        }
        a = b;
    }
    return a;
}

Which do you find easier to read and understand, and if you find one easier, what do you think the reasons are?

Snakes all the way down

Virtual machines (VMs) emulating hardware devices are generally implemented in low-level languages for performance reasons. This results in unmaintainable systems that are difficult to understand. In this paper we report on our experience using the PyPy toolchain to improve the portability and reduce the complexity of whole-system VM implementations. As a case study we implement a VM prototype for a Nintendo Game Boy, called PyGirl, in which the high-level model is separated from low-level VM implementation issues. We shed light on the process of refactoring from a low-level VM implementation in Java to a high-level model in RPython. We show that our whole-system VM written with PyPy is significantly less complex than standard implementations, without substantial loss in performance.
* We show how the execution and implementation details of WSVMs are separated in the same way as those of HLLVMs.
* We show how the use of preprocessing-time meta-programming minimizes the code and decreases the complexity.
* We provide a sample implementation of a WSVM prototype for PyPy which exhibits a simplified implementation without substantial loss of performance (about 40% compared to a similar WSVM in Java).
(groan, since when did Java become the gold standard for "fast"? I know, I know, "with a sufficiently advanced JIT...") (I (sincerely, seriously) apologize if this is not nearly theoretical enough for a forum like LtU.)