Programming Shorthands

We propose programming language mechanisms to reduce redundancy in program source code. These abbreviation mechanisms, shorthands, make programs shorter and easier to write and read. In addition, we provide a framework for describing language abbreviation mechanisms.

An interesting and odd paper from Todd Proebsting (of "Proebsting's Law"fame) and Ben Zorn. If you like $_ and @_ in Perl, then you may like this, too. I can't recall seeing any other papers on this topic, so pointers are welcome.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

more by Proebsting & Zorn

This paper was written by both Todd Proebsting and Ben Zorn (as clicking the link to the paper would quickly let you know). They wrote another paper similarly focused on making programs more concise, called Tangible Program Histories.

-- Emery

Story updated

I added Ben Zorn's name to the story. Thanks for pointing that out.

Inside-out

There are some interesting ideas there, but I think they need to turn their constructs inside-out, and have programs work implicitly on lists of elements. Instead of...
foo.diffusion_array[i] = xi;
                   [j] = yi;
                   [k] = zi;
...you'd do something like...
foo.diffusion_array[{i,j,k}] = {xi,yi,zi};
...likewise...
  • fn(a, 0, 1, 2);
    fn(b, ...);
    fn(c, ...);
    
  • fn({a,b,c}, 0, 1, 2);
    
...and...
  • foo(x + y, z, bar(g));
    foo(x - y, ...);
    foo(x + 1, ...);
    foo(x - 1, ...);
    
  • foo({x+y, x-y, x+1, x-1}, z, bar(g));
    
...
  • planet1.spatial_dist.pt.x.velocity = v1;
    planet2.spatial_dist.pt.x.velocity = v2;
    
    planet1...mass = m1; 
    planet2...mass = m2;
    
    planet1...accel = a1; 
    planet2...accel = a2;
    
  • planet1.{spatial_dist.pt.x.velocity, mass, accel} = {v1,m1,a1}; 
    planet2.{spatial_dist.pt.x.velocity, mass, accel} = {v2,m2,a2};
    
  • planet{1,2}.{spatial_dist.pt.x.velocity, mass, accel} = {{v1,m1,a1},{v2,m2,a2}};
    

Icon

My Icon is a little bit rusty but I think you could do some of your examples with the every operator:

every fn(a|b|c, 0, 1, 2)
every foo(x+y|x-y|x+1|x-1, z, bar(g))

The first example would require a kind of parallel comprehension because plain usage of | with every has a cartesian product semantics:

every (i,x) := (i,xi)|(j,yi)|(k,zi) do 
    foo.diffusion_array[i] := x


If we define a every-zip operator for parallel generator evaluation it would match your example:

every-zip foo.diffusion_array[i|j|k] := xi|yi|zi

Icon

Perhaps that is why Proebsting wrote this paper just three months after his paper on a new implementation of Icon.

A small change

My partner and I looked at this, particularly

foo.diffusion_array[{i,j,k}] = {xi,yi,zi};

and

planet1.spatial_dist.pt.x.{velocity, mass, accel} = {v1,m1,a1};

and he immediately pointed out that this syntax requires correlating the order of items in two separate sets, a personal bugaboo. He suggested something along the lines of

foo.diffusion_array[{i = xi, j = yi, k = zi}];

and

planet1.spatial_dist.pt.x.{velocity = v1, mass = m1, accel = a1};

instead (that is, a tuple of pairs of elements, rather than a pair of tuples of elements), which is just a small step forward from C#3's object initializer syntax (sample from linked page):

class Phone {
    public int countryCode;
    public int areaCode;
    public int number;
}

class Customer {
    public string name;
    public string address;
    public Phone phone;
}

static void Main(string[] args) {
    var cust = new Customer {name = "Abhinaba Basu",
                      address = "1835 73rd Ave NE, Medina, WA 98039",
                      phone = new Phone {countryCode = 1, areaCode = 425, number = 9999999}};
}

Zip or Cross-Product?

I like the idea. What about this piece of code:

fn({a,b,c}, 0, 1, {0,1,2});

Does it yield:

fn(a, 0, 1, 0);
fn(b, 0, 1, 1);
fn(c, 0, 1, 2);c

or does it yield:

fn(a, 0, 1, 0);
fn(a, 0, 1, 1);
fn(a, 0, 1, 2);
fn(b, 0, 1, 0);
fn(b, 0, 1, 1);
fn(b, 0, 1, 2);
fn(c, 0, 1, 0);
fn(c, 0, 1, 1);
fn(c, 0, 1, 2);

Or, to put it another way, does it 'zip'* or does it do cross-product? Maybe better, is there a way you can signify which to do? Your last example might have the answer--it seems like a cross between a zip and a cross-product, but it does not seem clear to me which parts to cross and which parts to zip.

* I use the term zip because it's the term I'm familiar with, as in Haskell's "zip [1,2,3,4] ['a','b','c','d']". There is probably a better term.

Zip Comprehensions

Taking the cue from Daniel Yokomizo, we could use something like GHC's parallel/zip comprehensions, and a different syntax markers for zips vs. crosses. For example, make the stuff in curly brackets be the marker for cross products, and square brackets for the zips. So we might get something like...
{-# OPTIONS -fglasgow-exts #-}
main = do display planets
          display fn1
          display fn2
          display g

display x = mapM_ putStrLn x >> putStrLn ""

--planet{1,2}.{spatial_dist.pt.x.velocity, mass, accel} = [v1,m1,a1,v2,m2,a2];
planets = ["planet"++p++"."++sel++" = "++val
            | p   <- ["1","2"]
            , sel <- ["spatial_dist.pt.x","mass","vel"]
            | val <- ["v1", "m1", "a1", "v2", "m2", "a2"]]

--fn({a,b,c}, 0, 1, {0,1,2});
fn1 = ["fn("++x++",0,1,"++y++");" 
        | x <- ["a","b","c"]
        , y <- ["0","1","2"]]

--fn([a,b,c], 0, 1, [0,1,2]);
fn2 = ["fn("++x++",0,1,"++y++");" 
        | x <- ["a","b","c"]
        | y <- ["0","1","2"]]

--g({a,b,c},[1,2,3,4,5,6],{foo,bar},[7.0,8.0,9.0,10.0,11.0,12.0]);
g   = ["g("++a++","++n++","++f++","++r++");"
        | a <- ["a","b","c"]
        , f <- ["foo","bar"]
        | n <- map show [1,2,3,4,5,6]
        | r <- map show [7.0,8.0,9.0,10.0,11.0,12.0]]
...essentially, you put a comma before each group delimited by "{}" and a bar before the "[]"s. Also to get this to work with GHC, you have to sort the groups so that all the zip portions come at the end of the comprehension. Maybe there's some insight that can be gained from the array community (APL,K,J) on this topic?

Best language to write this in?

Ignoring for the moment the complication of parallel zips and cartesian products, what would be a good language (static/dynamic/whatever) to write the following family of functions in?
lift :: (a->b)       -> m a -> m b

lift :: (a->b->c)    ->   a -> m b -> m c
lift :: (a->b->c)    -> m a ->   b -> m c
lift :: (a->b->c)    -> m a -> m b -> m c
 
lift :: (a->b->c->d) ->   a ->   b -> m c -> m d
lift :: (a->b->c->d) ->   a -> m b ->   c -> m d
lift :: (a->b->c->d) ->   a -> m b -> m c -> m d
lift :: (a->b->c->d) -> m a ->   b ->   c -> m d
lift :: (a->b->c->d) -> m a ->   b -> m c -> m d
lift :: (a->b->c->d) -> m a -> m b ->   c -> m d
lift :: (a->b->c->d) -> m a -> m b -> m c -> m d

... etc. ...

...I'm thinking it might almost be doable in Haskell+extensions and some pretty heavy-duty type class hackery. But I'm wondering if there's a language out there that makes this semi-straight-forward.

For those interested in taking up the challenge, here are some tests...

if_ p t e = if p then t else e
 
tests_passed = and [
  lift succ [1,2,3] == [2,3,4],
   
  lift (+) 1 [2,3,4]       == [3,4,5],
  lift (+) [1,2,3] 4       == [5,6,7],
  lift (+) [1,2,3] [4,5,6] == [5,7,9],
 
  lift if_ True        'a'  "bc"  == ['a','a'], 
  lift if_ True        "ab" 'c'   == ['a','b'], 
  lift if_ True        "ab" "cd"  == ['a','b'],
  lift if_ [True,False] 'a' 'b'   == ['a','b'],
  lift if_ [True,False] 'a'  "bc" == ['a','c'],
  lift if_ [True,False] "ab" 'c'  == ['a','c'],
  lift if_ [True,False] "ab" "cd" == ['a','d']]

Implementation in Lisp

Here's a first pass at implementing the zips-only version in Common Lisp. You use "$" to introduce a sequence to be zipped over. The type of the return value's sequence is the same as the type of the sequence of the first "$" group.
(eval-when (:compile-toplevel :load-toplevel)  
  (set-macro-character #\$
    (lambda (stream char)
      (declare (ignore char))
      `(make-zips :val ,(read stream t nil t)))))

(defstruct zips val)

(defun cycle (x)
  (let ((xs (list x)))
    (setf (cdr xs) xs)))

(defun lift (f)
  (lambda (&rest args)
    (let ((out-type (type-of (slot-value 
                                (first (member-if #'zips-p args)) 'val)))
          (promoted (mapcar (lambda (x) (if (zips-p x) 
                                            (slot-value x 'val) 
                                            (cycle x))) args)))
      (apply #'map out-type (lambda (&rest a) (apply f a)) promoted))))

(defun if_ (pred then else)
  (if pred then else))

(defun tests-passed? ()
  (every (lambda (x) x) 
    (list 
      (equalp (funcall (lift #'+) $'(1 2 3) 4) '(5 6 7))
      (equalp (funcall (lift #'+) 1 $#(2 3 4)) #(3 4 5))
      (equalp (funcall (lift #'+) $'(1 2 3) $'(4 5 6)) '(5 7 9))
      (equalp (funcall (lift #'if_) t #\a $"bc") "aa")
      (equalp (funcall (lift #'if_) t $"ab" #\c) "ab")
      (equalp (funcall (lift #'if_) t $"ab" $"cd") "ab")
      (equalp (funcall (lift #'if_) $'(t nil) #\a #\b) '(#\a #\b))
      (equalp (funcall (lift #'if_) $'(t nil) #\a $"bc") '(#\a #\c))
      (equalp (funcall (lift #'if_) $'(t nil) $"ab" #\c) '(#\a #\c))
      (equalp (funcall (lift #'if_) $'(t nil) $"ab" $"cd") '(#\a #\d))
      (equalp (funcall (lift #'mapcar) 
                         $(list #'1+ (lambda (x) (* 2 x))) 
                         '(4 5 6)) '((5 6 7) (8 10 12))))))

(assert (tests-passed?))

Lambda the ultimate

Lambda, the ultimate abstraction:

(let ((a (diffusion-array foo)))
  (vector-set! a i xi)
  (vector-set! a j yi)
  (vector-set! a k zi))

(in the case above, we're suffering from the weight of the vector-set!, but that's a different issue...)

(let ((f (lambda (x) (fn x 0 1 2))))
  (map f (list a b c)))

(let ((f (lambda (x) (foo x z (bar g)))))
  (map f (list (+ x y) (- x y) (+ x 1) (- x 1))))

(let ((x1 (x (pt (spatial-dist planet1))))
      (x2 (x (pt (spatial-dist planet2)))))
  (set-velocity! x1 v1)
  (set-velocity! x2 v2)
  (set-mass! x1 m1)
  (set-mass! x2 m2)
  (set-accel! x1 a1)
  (set-accel! x2 a2))
;; or even
(let ((f (lambda (p v m a)
           (let ((x (x (pt (spatial-dist p)))))
             (set-velocity! x v)
             (set-mass! x m)
             (set-accel! x a)))))
  (f planet1)
  (f planet2))

The authors' response would

The authors' response would be that you're also introducing meaningless names (e.g. "a" or "f") in each case. Their point in this case is that just as in natural languages names are introduced implicitly by some constructs, so too they could be in programming languages. For example, when we say "Sean went to the store; he bought some cheese," the use of a male person's name binds "he". We don't have to say "let 'he' refer to Sean henceforth; he went to the store; he bought some cheese."

That's anaphora, but...

The "he" example is an example of the the anaphora that Dave Herman described, which is used in some Lisp macros.

But the paper is going further than that, introducing implicit subjects which don't even use a name like "he" or "it".

I might translate the first two examples into comparable Scheme as follows:

; chaining vector set, after the Smalltalk example by Thorsten Seitz
(define (vs! v i x)
  (vector-set! v i x)
  (lambda (i x) (vs! v i x)))

; example 1
(((vs! foo.diffusion_array i xi)
                           j yi)
                           k zi)

; example 2
; uses SRFI 26, "Notation for Specializing Parameters without Currying"
(map (cut fn <> 0 1 2)
     (list   a
             b
             c))

The second example is perhaps less pretty, but more general than the one in the paper, unless the paper defines semantics for cases like (fn ... a ...) somehow.

You're right, of course -- I

You're right, of course -- I was responding narrowly to the parent post. More broadly, however, we can get a similar effect with parallel sentence structure (at least in English): "I gave Alice 7 apples, Bob 3, Charlie 4." or even "I gave Alice 7 apples, Bob pears, Charlie oranges." This is harder to describe formally, and may have too much semantic tie-in to work in a programming language.

The written equivalent, which the authors even used in the paper IIRC, is ditto marks.

You know what they say...

Unfortunately, as we all know natural language and its use of pronouns can easily lead to ambiguous situations. Consider "Joe entered the bar and said hello to the bartender. He was wearing a hat." Who is wearing the hat?

This is why, when one needs to be precise, identities are indeed bound to new names -- take a look at most legal documents. "This license, granted by SuperDuperMegaCorp (hereafter referred to as "the company"), grants the purchaser of The Whizzbanggy Widget Set (hereafter referred to as "the sucker") ..."

Anyway this kind of idea suffers from the same type of problems, only they may actually be much more confusing, depending on what type of scoping rules you require the pronouns to play under. And it suffers from other problems that we generally don't see with natural language, as we don't have people adding and deleting lines to, say, War and Peace, or rearranging sections of the text as part of ongoing maintenance and development. If you're not paying attention to what you're manipulating, all of a sudden the pronouns take on different meanings and all is not so well. It's bitten me a few times in the past when I was using numbered regex groups -- goodness knows what kind of mischief their proposed $() construct could lead to...

Right, but we don't outlaw

Right, but we don't outlaw pronouns, or even discourage their use in formal writing.
Anything can be abused.

Call for backup?

So what are the tools that help bail somebody out who has been thrown overboard by somebody else's refactoring?

(I think saying "just because it can be abused doesn't mean we shouldn't have it" is sorta irresponsible. You have to figure out just how abusable it is and what the consequences of the abuse would be. Then if you are ethical I think you have to figure out how to shore-up against those situations.)

plug

“I can't recall seeing any other papers on this topic, so pointers are welcome.”
Taking the occasion to plug myself here, almost 10 years back I wrote an article on a very similar topic.

anaphora

I haven't had time to read the article, but this makes me think of anaphora, a context-sensitive reference to another expression (caveat: IANA linguist). One example of anaphora in programming languages is anaphoric if, which implicitly (unhygienically) binds the name it to the result of the test expression so it can be referred to within the consequent:

(aif (foobar) (+ it 42) (error 'nothing))

Ken Shan's dissertation draws another interesting relationship between anaphora and programming languages.

Smalltalk

In Smalltalk you can do some of these things using cascades which is very useful:

foo diffusionArray
   at: i put: xi;
   at: j put: xj;
   at: k put: xk

planet1 spatialDist point x 
   velocity: v1;
   mass: m1;
   accel: a1

Maybe your PL is just too "static"!

Redundancy in source code is often a sign that something in our language isn't first-class when it might be. In this case, we might:

  • make aspects of syntax first-class (macrology and metaprogramming), and
  • make aspects of program structure first-class (reflection).

This choice, made seriously, leads to the kind of metaprogramming that makes Ruby (and Python, to a lesser degree) so popular. The authors argue against ugly macrology, but to my eyes, there's nothing terribly ugly about:

path = 'spatial_dist.pt.x'
eval <<-"end;"
  planet1.#{path}.velocity = v1
  planet2.#{path}.velocity = v2
  planet1.#{path}.mass = m1
  planet2.#{path}.mass = m2
end;

(Well, at least it's not visually ugly...) The ability to do this kind of thing is really the much more important notion of "dynamic" that fans of these languages get excited about. Of course it's not totally unrelated to dynamic typing, but we know by now that constructs like this are useful, and we know it's worth fighting to analyze them when possible.

Not visually ugly?

There actually is a visually ugly part of that: editor support. How is your text editor supposed to know that the contents of that big string are code? When you're editing that code, you don't have the advantage of automatic indentation or syntax highlighting beyond "make it all string-colored". Now, it wouldn't be too hard to work around that for the specific "eval <<-..." case, but in general it would be nice if the language provided an idiomatic and well-supported way of writing code at runtime to evaluate. Example (in no particular programming language):

path = 'spatial_dist.pt.x'
planet1.#{path}.velocity = v1
planet2.#{path}.velocity = v2
planet1.#{path}.mass = m1
planet2.#{path}.mass = m2

Or, fancier (and more contrived):

def return_setter_code(planet, v, m):
  return runtime:
    path = 'spatial_dist.pt.x'
    planet.#{path}.velocity = ##{v}
    planet.#{path}.mass = ##{m}
  end
end

eval return_setter_code(planet1, m1, v1)
eval return_setter_code(planet2, m2, v2)

This introduces new syntax: there's the "runtime: ... end" block which is shorthand for making a string with source code in it and which can contain #{expr} to insert the evaluation of expr in the enclosing scope and ##{expr} for another level of enclosing scope, and so on. There's also the ability to use the #{expr} syntax without an enclosing "runtime: ... end" block, which just adds an implicit runtime block (and an eval, if it's being used outside a runtime block).

This is suspiciously similar to Lisp's macro facilities, with the `(backtick ,and ,,@quasiquote) stuff. In fact, I think it's almost exactly the same; eventually this proposed metaprogramming scheme would need gensym() and once_only() and all the other fun, fun aspects of Lispy metaprogramming.

I would use it.

yes, but... dependent syntax?

This is suspiciously similar to Lisp's macro facilities, with the `(backtick ,and ,,@quasiquote) stuff. In fact, I think it's almost exactly the same; eventually this proposed metaprogramming scheme would need gensym() and once_only() and all the other fun, fun aspects of Lispy metaprogramming.

This is more or less the approach taken by MetaML and MacroML, and by Dylan's macros (if I understand them correctly). One difficulty of this approach as opposed to a pure string-based approach is that it's often difficult to produce certain program fragments whose intermediate forms don't consist of well-formed syntax. For example, in our running example, it's not clear what "spatial_dist.pt.x" is, as context-less syntax. As a string, it could even be reasonably spliced into multiple different syntactic contexts (as part of a string constant, as an object traversal expression, the final "x" might be the beginning of a longer identifier, etc). This is very powerful, but of course since it's so undisciplined it's very hard to say anything about it statically.

By analogy with types, I suggest that we might call this kind of macrology "dependent syntax", in which even syntactic well-formedness of a program depends on the semantics of the program. I suspect it may be possible to say a lot more about dependently parsed programs than has been said so far... (And, witness the perennial popularity of string-based metaprogramming, I suspect it would be worth saying.)

Metaprogramming in Q

Another way to go about this are user-defined special forms as Q (a dynamically typed functional programming language based on term rewriting) provides them. This kills two birds with one stone: It gives you both lazy data structures and dynamic (runtime) macros. The latter makes for a fairly powerful metaprogramming facility. E.g., in Q the lambda construct is just another special form (which could actually be defined by the programmer, but is provided as a builtin for efficiency), and you can define list comprehensions on top of this with the following little Q snippet (in fact I copied that definition straight from Q's standard library):

listof A Cs		= `(listofx A Cs);

listofx A ()		= '[A];
listofx A (X in Xs|Cs)	= '(cat (map (\X.`(listofx A Cs)) Xs))
			    if isvar X; // will always match
			= '(cat (map (\X.`(listofx A Cs))
				 (filter (matchp X) Xs)));
listofx A (X in Xs)	= '(cat (map (\X.[A]) Xs)) if isvar X;
			= '(cat (map (\X.[A]) (filter (matchp X) Xs)));
listofx A (P|Cs)	= '(ifelse P `(listofx A Cs) []);
listofx A P		= '(ifelse P [A] []);

The listofx function constructs the actual "code" (an expression of nested lambdas, cat, map, filter and if-then-else applications) which is returned as a quoted expression and then implicitly evaluated when it gets unquoted with the backquote operator (first line, definition of listof). The backquote operator can also be used to evaluate and splice a subexpression into a quoted expression (as can be seen in the 3rd and 4th equation).

Expressions are first class citizens in Q which you can manipulate to your heart's content before actually evaluating them. That makes Q's special forms much more powerful than, say, good ol' Algol's call by name or Alice's lazy futures. They are like Lisp macros, only fully dynamic.