LtU Forum

Tuples, functions, ghost functions and higher order functions

Software verification depends on clearly specified functions with well defined properties. Functions can be objects by themselves, i.e. they should be first class values.

Tuples can be used to treat functions with zero, one or more arguments and one or more return values uniformly. The article Tuples and functions demonstrate the versatility of tuples.

Functions might be computable or not computable. Clearly, in actual computations only computable functions can be used. However on the assertions level (when one reasons about computations) not computable functions are a powerful tool (as long as the functions are well defined). Ghost functions can be defined to fill this gap.

Higher order functions (i.e. functions that take functions as arguments and/or return functions as values) are standard in functional programming style. Higher order functions help to reason about software and can structure the assertions needed.

The article Ghost functions and higher order functions describes the basic concepts.

Evaluating the Design of the R Language

From our recent discussion on R, I thought this paper deserved its own post (ECOOP final version) by Floreal Morandat, Brandon Hill, Leo Osvald, and Jan Vitek; abstract:

R is a dynamic language for statistical computing that combines lazy functional features and object-oriented programming. This rather unlikely linguistic cocktail would probably never have been prepared by computer scientists, yet the language has become surprisingly popular. With millions of lines of R code available in repositories, we have an opportunity to evaluate the fundamental choices underlying the R language design. Using a combination of static and dynamic program analysis we assess the success of different language features.

Excerpts from the paper:

R comes equipped with a rather unlikely mix of features. In a nutshell, R is a dynamic language in the spirit of Scheme or JavaScript, but where the basic data type is the vector. It is functional in that functions are first-class values and arguments are passed by deep copy. Moreover, R uses lazy evaluation by default for all arguments, thus it has a pure functional core. Yet R does not optimize recursion, and instead encourages vectorized operations. Functions are lexically scoped and their local variables can be updated, allowing for an imperative programming style. R targets statistical computing, thus missing value support permeates all operations.

One of our discoveries while working out the semantics was how eager evaluation of promises turns out to be. The semantics captures this with C[]; the only cases where promises are not evaluated is in the arguments of a function call and when promises occur in a nested function body, all other references to promises are evaluated. In particular, it was surprising and unnecessary to force assignments as this hampers building infinite structures. Many basic functions that are lazy in Haskell, for example, are strict in R, including data type constructors. As for sharing, the semantics cleary demonstrates that R prevents sharing by performing copies at assignments.

The R implementation uses copy-on-write to reduce the number of copies. With superassignment, environments can be used as shared mutable data structures. The way assignment into vectors preserves the pass-by-value semantics is rather unusual and, from personal experience, it is unclear if programmers understand the feature. ... It is noteworthy that objects are mutable within a function (since fields are attributes), but are copied when passed as an argument.

But then, they do a corpus analysis to see what features programmers actually use! We don't do enough of these in PL; examples from the paper:

R symbol lookup is context sensitive. This feature, which is neither Lisp nor Scheme scoping, is exercised in less than 0.05% of function name lookups...The only symbols for which this feature actually mattered in the Bioconductor vignettes are c and file, both popular variables names and built-in functions.

Lazy evaluation is a distinctive feature of R that has the potential for reducing unnecessary work performed by a computation. Our corpus, however, does not bear this out. Fig. 14(a) shows the rate of promise evaluation across all of our data sets. The average rate is 90%. Fig. 14(b) shows that on average 80% of promises are evaluated in the first function they are passed into.

And so on. A lot of great data-driven insights.

Inheritance and formal verification of software

Inheritance is an important technique for abstraction in object oriented software. Functional languages like Haskell have some form of inheritance of concepts as well (e.g. the class Eq).

Verification techniques should therefore take inheritance into account.

By looking further into the topic it becomes evident that inheritance is a good vehicle for program verification as well i.e. inheritance helps verification.

Abstract classes can make some assumptions and prove some other properties based on the assumptions and the defined functions within the absract class.

A descendant of the abstract concepts just needs to substantiate the assumptions and can inherit all the proved proved properties of the abstract concept.

This technique is described in the article "inheritance" and demonstrated with some examples like equality, partial order and total order.

Frenetic

Frenetic is a Python-embedded combinator notation for writing the control software for OpenFlow software-defined networking switches.

Frenetic: A High-Level Language for OpenFlow Networks:

Most interfaces for programming network devices are defined at the low level of abstraction supported by the underlying hardware, which leads to complicated programs that are prone to errors. This paper proposes a high-level programming language for OpenFlow networks based on ideas originally developed in the functional programming community. Our language, called Frenetic, includes a rich pattern algebra for classifying packets, a “program like you see every packet” abstraction, and a run-time system that automatically generates the low-level packet-processing rules. We describe the design and implementation of Frenetic, and show how to use it to implement common management tasks.

Frenetic: A Network Programming Language:

This paper presents Frenetic, a high-level language for programming distributed collections of network switches. Frenetic provides a declarative query language for classifying and aggregating network traffic as well as a functional reactive combinator library for describing high-level packet-forwarding policies. Unlike prior work in this domain, these constructs are—by design—fully compositional, which facilitates modular reasoning and enables code reuse. This important property is enabled by Frenetic’s novel runtime system which manages all of the details related to installing, uninstalling, and querying low-level packet-processing rules on physical switches.

A Compiler and Run-time System for Network Programming Languages:

In this paper, we define a high-level, declarative language, called NetCore , for expressing packet-forwarding policies on SDNs. NetCore is expressive, compositional, and has a formal semantics. To ensure that a majority of packets are processed efficiently on switches—instead of on the controller—we present new compilation algorithms for NetCore and couple them with a new run-time system that issues rule installation commands and traffic-statistics queries to switches. Together, the compiler and run-time system generate efficient rules whenever possible and outperform the simple, manual techniques commonly used to program SDNs today. In addition, the algorithms we develop are generic, assuming only that the packet-matching capabilities available on switches satisfy some basic algebraic laws.

Specification and implementation of modules in Modern Eiffel

The article talks about modules in Modern Eiffel (a variant of Eiffel which allows static verification).

The main points are

1. integration of functional and object oriented style

2. physical separation of specification and implementation

ad 2: This separation is not only possible for functions and procedures, but also for assertions. I.e. the specification part includes only the assertions. The proofs can reside in the implementation part of a module.

Why and How People Use R

Compelling lecture by John Cook

Abstract:

R is a strange, deeply flawed language that nevertheless has an enthusiastic and rapidly growing user base. What about R accounts for its popularity in its niche? What can language designers learn from R's success?

Certified Programming with Dependent Types: home stretch!

For the last few years, I've been working on Certified Programming with Dependent Types, a book giving a pragmatic introduction to the Coq proof assistant. The drafts have been available for free online, and future versions should be as well; but I'm also working toward a traditional print version from MIT Press.

I've recently finished the last non-trivial content changes that I've had planned, so I write to you to ask for your help in finding the last mistakes and opportunities for improvement! Once everything is in good shape, I can move forward with traditional publication.

There are a few categories of known issues. One is font choice and other typesetting; this is coming from coqdoc, Coq's documentation generation program, and I expect to need to jump into the coqdoc source code to an extent, to get everything looking pretty without excessive countermeasures in the book source code.

Where I could especially use help is finding appropriate citations to research papers and so on. There are few subjects that I'm already looking for the right citations on, and I figure LtU is a great place to find people who have favorites in these categories. So, could you please let me know if you have a citation to recommend for any of the following?

  • the concept of continuation and/or continuation-passing style
  • the concept of unification
  • infinite data structures in Haskell
  • Haskell "deriving" clauses
  • some introduction to logic programming, to cite in lieu of going into full detail on the usual foundations of the paradigm

I'll also appreciate advice on any other points in the text that you think deserve citations, and I'll appreciate suggestions the most if they come with suggested citation targets. :)

And any other finding of typos, confusing prose, and so on will be much appreciated! You can e-mail me or reply to this post, as seems appropriate.

Thanks in advance! I'm looking forward to getting the book out there.

Symmetry in type systems

Type systems exhibit a certain symmetry that causes almost all features to be duplicated. For example: sums & products, covariance & contravariance, unions & intersections, universals & existentials, exstensible records & extensible variants, etc. In each case you can transform one into the other with callbacks (continuation passing).

Even a function type like int -> string is made up of two parts. If we see a type annotation E(v : int -> string), that is saying two things: (1) the value v, when given an int, will produce a string (2) the continuation E will only call v with an int, and will be happy to get a string in return. The type annotation is not just constraining what the value can be, but also what the continuation can be.

What should I read to understand this two-sidedness better? Some question I have are: in physics when there is symmetry in a problem, it can often be eliminated to make the problem simpler. Can the symmetry in type systems be eliminated? For example you can imagine a purely CPS language where you have a type K(T) that represents a continuation that takes a T. Then you have K(A + B) = K(A) x K(B), and similar for other types. For function types you have A -> B = K(A x K(B)). K essentially flips the constraints around: the constraints that were laid on the value are now laid on the continuation and vice versa. Has this been explored? What should I read?

Another question is: does it make sense to separate the two pieces of a type? For example does it make sense to have a function type that only says "when given an int, v produces a string" but does not guarantee that the continuation will only call it with an int (perhaps in combination with other features like intersection and union)? Does this two-sidedness arise in other languages, like logic languages, where it's not so clear what are inputs and what are outputs?

Reverend Bayes, meet Countess Lovelace: Probabilistic Programming for Machine Learning

Andy Gordon's talk, Reverend Bayes, meet Countess Lovelace: Probabilistic Programming for Machine Learning, is online.


We propose a marriage of probabilistic functional programming with Bayesian reasoning. Infer.NET Fun turns the simple succinct syntax of F# into an executable modeling language – you can code up the conditional probability distributions of Bayes' rule using F# array comprehensions with constraints. Write your model in F#. Run it directly to synthesize test datasets and to debug models. Or compile it with Infer.NET for efficient statistical inference. Hence, efficient algorithms for a range of regression, classification, and specialist learning tasks derive by probabilistic functional programming.

Pythonect 0.1.0 Release

Hi All,

I'm pleased to announce the first beta release of Pythonect interpreter.

Pythonect is a new, experimental, general-purpose dataflow programming language based on Python.

It aims to combine the intuitive feel of shell scripting (and all of its perks like implicit parallelism) with the flexibility and agility of Python.

Pythonect interpreter (and reference implementation) is written in Python, and is available under the BSD license.

Here's a quick tour of Pythonect:

The canonical "Hello, world" example program in Pythonect:

>>> "Hello World" -> print
<MainProcess:Thread-1> : Hello World
Hello World
>>>

'->' and '|' are both Pythonect operators.

The pipe operator (i.e. '|') passes one item at a item, while the other operator passes all items at once.

Python statements and other None-returning function are acting as a pass-through:

>>> "Hello World" -> print -> print
<MainProcess:Thread-2> : Hello World
<MainProcess:Thread-2> : Hello World
Hello World
>>>

>>> 1 -> import math -> math.log
0.0
>>>

Parallelization in Pythonect:

>>> "Hello World" -> [ print , print ]
<MainProcess:Thread-4> : Hello World
<MainProcess:Thread-5> : Hello World
['Hello World', 'Hello World']

>>> range(0,3) -> import math -> math.sqrt
[0.0, 1.0, 1.4142135623730951]
>>>

In the future, I am planning on adding support for multi-processing, and even distributed computing.

The '_' identifier allow access to current item:

>>> "Hello World" -> [ print , print ] -> _ + " and Python"
<MainProcess:Thread-7> : Hello World
<MainProcess:Thread-8> : Hello World
['Hello World and Python', 'Hello World and Python']
>>>

>>> [ 1 , 2 ] -> _**_
[1, 4]
>>>

True/False return values as filters:

>>> "Hello World" -> _ == "Hello World" -> print
<MainProcess:Thread-9> : Hello World
>>>

>>> "Hello World" -> _ == "Hello World1" -> print
False
>>>

>>> range(1,10) -> _ % 2 == 0
[2, 4, 6, 8]
>>>

Last but not least, I have also added extra syntax for making remote procedure call easy:

>>> 1 -> inc@xmlrpc://localhost:8000 -> print
<MainProcess:Thread-2> : 2
2
>>>

Download Pythonect v0.1.0 from: http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz

More information can be found at: http://www.pythonect.org

I will appreciate any input / feedback that you can give me.

Thanks,

Itzik Kotler

XML feed