LtU Forum

Symbol visibility (public, private, protected, etc.)

I'd be curious to know what people's opinions are on the matter of symbol visibility, and whether there are any new and fresh ideas in this space.

Now, you might think that symbol visibility would be a rather dull and pedestrian topic for a language theorist - few research papers on language design even mention the topic. However, I think that this is actually a very interesting area of language design.

Limiting the visibility of symbols has immense practical value, especially for large code bases: the main benefit is that it simplifies reasoning about the code. If you know that a symbol can only be referenced by a limited subset of the code, it makes thinking about that symbol easier.

There are a number of languages which have, in pursuit of the goal of overall simplicity, removed visibility options - for example Go does not support "protected" visibility. I think this is exactly the wrong approach - reducing overall cognitive burden is better achieved by having a rich language for specifying access, which allows the programmer to narrowly tailor the subset of code where the symbol is visible.

Here's a real-world example of what I mean: One of the code bases I work on is a large Java library with many subpackages that are supposed to be internal to the library, but are in fact public because Java has no way to restrict visibility above the immediate package level. In fact, many of the classes have a comment at the top saying "treat as superpackage private", but there's no enforcement of this in the language.

This could easily be solved if Java had something equivalent to C++'s "friend" declaration, the subpackages could then all be made private, and declare the library as a whole to be a friend.

However, I wonder if there's something that's even better...

The Two Dualities of Computation: Negative and Fractional Types

The Two Dualities of Computation: Negative and Fractional Types

Abstract

Every functional programmer knows about sum and product types, a+b and a*b respectively. Negative and fractional types, a-b and a/b respectively, are much less known and their computational interpretation is unfamiliar and often complicated. We show that in a programming model in which information is preserved (such as the model introduced in our recent paper on Information Effects), these types have particularly natural computational interpretations. Intuitively, values of negative types are values that flow “backwards” to satisfy demands and values of fractional types are values that impose constraints on their context. The combination of these negative and fractional types enables greater flexibility in programming by breaking global invariants into local ones that can be autonomously satisfied by a subcomputation. Theoretically, these types give rise to two function spaces and to two notions of continuations, suggesting that the previously observed duality of computation conflated two orthogonal notions: an additive duality that corresponds to backtracking and a multiplicative duality that corresponds to constraint propagation.

Executable comments

So this idea just sort of crystallized in my head: what if rather than documenting code does in comments, we instead added a bunch of printfs to the code so, when necessary, the code can explain what its doing while doing it. Of course, this doesn't seem so new and novel, but what I'm suggesting is that comment syntax be replaced by printf syntax in a language design. Then, we somehow must provide a way to "turn it on/off" so that when it is off, it has zero cost, and we can selectively turn it on during debugging when we need code to explain themselves.

This should apply to library documentation also, so that methods can explain what they are doing when stepping through code (as Bret Victor demonstrates in his learnable programming essay), while the print statements should clean enough that they can also act as static documentation when necessary.

What do you all think?

Safe interior references in the presence of mutation

Suppose like Rust & C/C++, we want to support references that point to the inside of another value. A reference<t> is something that can be dereferenced (deref : reference<t> -> t). A reference may also be mutable, so that we can set its value. It may also support other operations, such as walking a data structure. Examples:

  1. Given a struct x { a:int, b:int }, we can obtain a reference y=&x.a. Later on, we can read from that reference.
  2. Given a sum type, we can pattern match on it and obtain a reference to the value inside of it.
  3. Given an array, we can obtain an iterator of the array. We can dereference the iterator to obtain the current value, and we can call next() to obtain a reference to the next item. The same goes for an iterator over a tree.

Note that "reference" in the sense I'm using is not necessarily just a pointer. A tree iterator may maintain a stack of parent nodes in order to be able to traverse a singly linked tree.

There's one nasty problem with references, which is that in the presence of mutation they can lead to memory unsafety or run time errors. Examples:

  1. Given the sum type (Either Int String), and x = Left 3, we can pattern match on x and obtain a reference r to the value 3 inside it. If we then mutate x := Right "foo", now the reference r is no longer valid. If r is represented as a plain old pointer, then if we dereference it we will try to read an int where now a string is stored.
  2. Given a tree iterator over some tree, we can go to somewhere in the middle of the tree. If we now mutate the tree, the iterator may no longer be valid.

As you can see, it's not just a problem with pointers. Even in a language like Java, you have problems with iterators when the underlying data structure is mutated. At run time they may throw ConcurrentModificationException.

Rusts type system attempts to prevent these kind of errors statically, but Rusts type system isn't fully finalized yet and it is quite complicated. I was wondering if there is existing literature on this problem and perhaps there is some elegant way to prevent this with types. It would probably involve linear types and capabilities.

If such a thing does not exist yet, one thing I can think of is to require a linear capability in order to mutate the value, and in order to obtain a reference to the inside of the value you must give up this capability. In return you get a capability that allows you to read the references. If you later want to mutate again, you have to give up the capability to read the references. Is this a good approach? Why/why not?

Those pesky higher-rank types. Or how to type \f x y. (f x, f y)

As people might have noticed I've been somewhat interested in how to type

g = \f x y. (f x, f y)

Now you might think HM, as implemented in various languages, does a nice job typing that. Ocaml gives the most readable result as

# let g f x y = (f x, f y);;
val g : ('a -> 'b) -> 'a -> 'a -> 'b * 'b = 

Or you could go higher rank and type it

g :: (forall a b. a -> b) -> a -> b -> (c, d)
g = \f x y -> (f x, f y)

But, for various reasons, I want to type it

def g: (a -> f a) -> b -> c -> (f b, f c) =
    [f, x, y -> (f x, f y)]

Since I am lazy and assume everything has been done in literature somewhere, does anyone have a reference with a typing system like that?

PS. 'Solved' in Haskell. Courtesy of Sjoerd Visscher, Haskell allows for quantification over type constructors; though I want to avoid explicit quantifcation.

g :: (forall a. a -> f a) -> b -> c -> (f b, f c)
g f x y = (f x, f y)

PPS. Gashe noticed that Haskell is unable to recognize that f should sometimes be unificated with 'id = /\a. a'

*Main> g (+ 2) 3 4

:27:4:
    Couldn't match type `a' with `f0 a'
      `a' is a rigid type variable bound by
          a type expected by the context: a -> f0 a at :27:1
    Expected type: a -> f0 a
      Actual type: a -> a
    In the first argument of `g', namely `(+ 2)'
    In the expression: g (+ 2) 3 4
    In an equation for `it': it = g (+ 2) 3 4

Facebook Flux: simplify the dataflow

An overview of Flux by Facebook.

Flux is the application architecture that Facebook uses for building client-side web applications. It complements React's composable view components by utilizing a unidirectional data flow. It's more of a pattern rather than a formal framework, and you can start using Flux immediately without a lot of new code.

What I'd personally really like to learn from people here on LtU is: What do you think should be make simple, possibly at the expense of other concerns? Apparently Flux is saying keeping dataflow as a unidirectional loop is an important thing for various reasons, presumably including mental modeling of it all. Are there other things you'd say are more important? Or do you have a language+environment such that nothing has to be simplified and kept simple, and yet still stays sane for regular developers to grok? (Other things that I think have a meme along these lines: DCI tries to eschew polymorphism; DDD tries to keep close to the business domain; Go developers refuse to add things other languages have; etc.)

Removing User Interface Complexity, or Why React is Awesome

This blog post has been going around Hackernews recently, and I was wondering if there was something deeper here about how new reactive programming paradigms are being adopted by the web crowd.

React by Facebook appears to be a twist on immediate-mode UI, limited one-way databinding, and something something for the web. Technologically, it doesn't seem that innovative (especially for us), but I guess the key is in its simplicity. Does anyone here have any experience using this, and how does it compare to approaches like FRP?

Explicit renaming of bound variables

Hi
I tried to make reductions for usual lambda-terms "as for DeBrujn's terms". May be, it will be interesting for somebody
\begin{align*}
&(\lambda xy.x)\,y && \\
&\to [y/x]\circ\lambda y.x && (Beta)\\
&\to \lambda y.\Uparrow\![y/x]_y\circ x && (Lambda)\\
&\to \lambda y.W_y\circ[y/x]\circ x && (LiftShift')\\
&\to \lambda y.W_y\circ y && (Var)\\
&\to \lambda z.\{zy\}\circ W_y\circ y && (\alpha)\\
&\to \lambda z.W_z\circ y && (IdShift)\\
&\to \lambda z.y && (W)
\end{align*}
The line \alpha is "explicit renaming of bound variable". Details are here
http://arxiv.org/abs/1303.5039
The text isn't easy.

Using real arithmetic to eliminate bounds checking?

The usual problem with eliminating array bounds checking is that integer arithmetic is undecidable. For example, the compiler can't figure out that you can index into an array of m·n elements using an expression like i·n+j, even if it's known statically that 0≤i<m and 0≤j<n. Or rather, that particular problem is simple, but it's hard to come up with a general algorithm that would work for many different programs. Many people are trying to use dependent types and machine-checkable proofs for that, like in Hongwei Xi's paper.

It just occurred to me that statements like "if i ∈ [0,m-1] and j ∈ [0,n-1], then i·n+j ∈ [0,m·n-1]" (all segments are closed) can be interpreted and proved in real arithmetic, instead of integer arithmetic. The nice thing is that real arithmetic is decidable. If that approach works, we can have a language with integer-parameterized types like "array of n elements" and "integer between m and n", and have a simple definition of what the type checker can and cannot automatically prove about programs using such types, without having to write any proofs explicitly.

Of course the system will be limited by the fact that it's talking about reals rather than integers. For example, there will be problems with integer-specific operations on type parameters, like division and remainder, but I don't know how often these will be required.

Are there any obvious problems with this idea? Has it been tried?

type theory about programming language?

I have a question for which I expect a good answer from Matt M, Andreas Rossberg, or Jules Jacobs, should any of these folks feel inclined to reply, because they clearly know a lot about type systems.

Is type theory math? How is discussing type theory also discussion of programming language? When analyzing types in a specific PL, I can see type theory informing analysis of that PL. But when talking about type systems in the abstract, apart from a PL, it just sounds like philosophy of algebra to me. I can't find the PL angle unless the idea is that all programming languages should be about math. What I'm trying to do is put type theory discussion in context. What is it good for? (I hope asking a simple brass-tacks question is not rude.)

I don't mind getting beat up as long as it's educational. :-) But I'm trying to express curiosity rather than a challenging attitude. Edit: I'm going from my personal reaction to type theory discussions, rather than anything I read. I hope to learn something, rather than push my impression that type theory is just math. So a reasoned contradiction would be interesting.

XML feed