LtU Forum

Dimensional Analysis in real world and Type Systems

I'm not sure I know what I'm talking about...

Isn't dimensional analysis used in natural science a kind of type checking? If so, doesn't it mean Curry–Howard isomorphism is applicable and, therefore, dimensional analysis is simply theorem proofing?

A reversible Javascript interpreter

I posted a while ago about a meta-circular Javascript interpreter I've been working on (now called Tailspin).

It is now reversible. As well as running in the browser and allowing algorithms to be visualised, Tailspin can now run code both forwards and backwards (using lots of continuations).

Being reversible provides some really interesting opportunities for exploring program execution and aiding learning to code. So far I have focused on algorithm visualisation where reversibility makes it possible to scrub through the execution of algorithms. This is not just fun, it transforms the experience of understanding the algorithm. I've been surprised at how simple it has been to create some really good visualisations (see the wiki), it's a little hacky at the moment, but it's really easy using the wiki to create an algorithm visualisation and without any work to be able to scrub back and forth through it.

Suggestions for uses or improvements are very welcome.


More info: http://will.thimbleby.net/reversible-javascript/
Wiki: http://will.thimbleby.net/algorithms
IDE: http://will.thimbleby.net/algorithms/inspector

MacroPy: Bringing Macros to Python. Quasiquotes, case classes, LINQ and more!

Hey All,

MacroPy (https://github.com/lihaoyi/macropy) is a project we've been working on, which gives you the ability to (very easily) perform AST manipulations on a Python program, and runs on a unmodified Python runtime (CPython 2.7 or PyPy 2.0).

We have a whole suite of demo macros, which I think are all pretty compelling use cases for macros in a language which traditionally doesn't have such functionality. They are pretty different from what comes up if you'd google "lisp macros"!

We think this is super cool, so hopefully you find it interesting. Looking forward to any feedback!

Thanks!
-Haoyi

what's a useful definition of equivalent functions?

Equivalence of functions is, in general, undecidable. But specific instances of it are usually decidable. Equivalence can be proven via structure analysis, or disproven via structure analysis or by example. Equality can be proven on functions whose input is limited to a finite set simply by exhaustive testing of all possible inputs. So, while it's true that there will always be an infinite set of pairs of functions for which we cannot prove equivalence or nonequivalence, it isn't really the case that we can't ever prove anything.

But what notion of 'equivalence' is useful?

What if we have two functions that are provably equivalent where their results are defined, but result in errors in different cases, or result in *different* errors in some cases? If I can prove, in the context of a particular program, that no possible call to a function will be a member of either error-causing set, then the functions are equivalent ... for purposes of that program. ie, they are observationally equivalent for the particular subset of valid inputs that we're interested in.

What if we have two functions that do equivalent things to members of slightly different types? For example, if I have two string-concatenation functions, but one of them concatenates a string class on which fifteen other operations are defined and one of them concatenates a string class on which thirteen of the _same_ other operations and three _different_ other operations are defined? These concatenation functions are not exactly mathematically equivalent, but in some very important way they are exactly the same, and may even be expressed by exactly the same source code. If a mathematician said these functions weren't equivalent, as an engineer I would roll my eyes and ignore him. If my job required me to satisfy his useless notion of equivalence, I would *make* the functions equivalent by just defining the 'missing' operations in both classes.

What about having 'equivalent' multiplication functions on, say, ints and rationals? If I posit that there is a multiplication function on a 'number' type that both of the other functions are subsets of (that is, they give equal results on equal inputs, for all inputs they can be given), is that a useful notion of equivalence?

What if we have two functions that return exactly the same results in all cases, and we can prove it ... but one of them takes N-squared memory and NlogN runtime, and the other takes N-squared runtime and NlogN memory? As an engineer I'd find this semantic equivalence and resource-requirement nonequivalence useful and swap one for the other depending on whether speed or memory use were more important. As a mathematician, I don't even know whether I'd be working with a notion of equivalence that would note the difference.

In the end, 'equivalence' is any relation that's transitive, symmetric, and reflexive. There's an infinite number of choices of such relations, and people who talk about function equivalence often do so without saying which equivalence they mean. Or, importantly, which equivalence is most useful in some relevant context.

And while you're chewing on the question of what 'equivalence' is most useful in a given context, what notion of proof is most useful to establish it?

What if we have a pair of functions which has been observed to return the same answer for the same inputs on a randomly-selected billion possible inputs ... and monitoring both has revealed that we have achieved 100% code coverage in both functions while doing so ... and other monitoring, plus a few million not-so-randomly-selected inputs, has assured that all possible combinations of taken and not-taken conditional branches in both functions have been exercised, on inputs that differ from each other by the theoretical minimum possible for that input type? As a mathematician, I couldn't claim that I know these functions are equivalent. As an engineer, I'd consider the indications of function equality to be strong enough that I wouldn't mind the absence of a proof. I'd feel comfortable swapping one for the other, especially if they are both relatively short.

Why short? Shortness limits the possible semantic complexity to some wobbly notion of a boundary, making it less likely that one or both contains subtleties not revealed by the extended fuzz-test. If one or both of the functions were twice as long, I'd want more fuzz-testing before I'd feel as comfortable swapping them. This corresponds to some intuition about reaching a particular certainty level, but that certainty level, no matter how you slice it, is less than an absolute proof.

So, here's a question. Is it useful for programmers to go with some 'heuristic' of function equality, or an engineering-grade 'indication' of function equality, if no mathematically certain proof is available and by doing so they can achieve measurably better results? Is it worthwhile for PL theorists and researchers to seek and refine such 'heuristics' and ways of estimating the strength of such 'indications' (rather than 'proofs') of function equality?

In mechanical engineering, strength of materials is always an estimate. In programming, sometimes the strength of our knowledge is also an estimate. This is a problem, but not necessarily a disaster, in terms of producing useful results. Mechanical engineers compensate by including a 'safety margin' well beyond the requirements of the particular job. How should programmers compensate?

getting feet wet re: Computational Thinking

C.T. has been mentioned on LtU before. I do hope that it gets ever wider attention. I like this paper in that the approach is not mainly about making people be able to be programmers. It is about making people be able to understand things e.g. so they can (as I see it) understand items in the news, or talk about things over coffee. Even people who learn to program often miss the forest for the trees, I think.

How Computers Work: Computational Thinking for Everyone

What would you teach if you had only one course to help students grasp the essence of computation
and perhaps inspire a few of them to make computing a subject of further study? Assume they have
the standard college prep background. This would include basic algebra, but not necessarily more
advanced mathematics. They would have written a few term papers, but would not have written
computer programs. They could surf and twitter, but could not exclusive-or and nand. What about
computers would interest them or help them place their experience in context? This paper provides
one possible answer to this question by discussing a course that has completed its second iteration.
Grounded in classical logic, elucidated in digital circuits and computer software, it expands into
areas such as CPU components and massive databases. The course has succeeded in garnering the
enthusiastic attention of students with a broad range of interests, exercising their problem solving
skills, and introducing them to computational thinking.

"The course includes some computer programming, but does not dwell on it."

I wish more real-world day-job programmers (including refreshing my own self I can't be too hypocritical) really learned these things:

The big ideas in the honors course are
1. the correspondence between digital circuits and formulas in logic,
2. how abstractions facilitate combining solutions to small problems to form solutions to big ones,
3. how algebraic formulas can specify computations,
4. how models expressed in software capture the behavior of processes and devices,
5. how important, complex algorithms derive from simple, definitional properties,
6. how different definitional properties can produce the same results at vastly different computational
expense,
7. how computational expense makes some useful devices feasible and renders others infeasible,
8. and how all of these ideas bear on the ability of computers to deal with information on the massive
scale needed to provide services like search engines, internet storefronts, and social networks

Terra: A low-level counterpart to Lua

A very interesting project developed by Zachary DeVito et al at Stanford University:

Terra is a new low-level system programming language that is designed to interoperate seamlessly with the Lua programming language:

-- This top-level code is plain Lua code.
print("Hello, Lua!")

-- Terra is backwards compatible with C
-- we'll use C's io library in our example.
C = terralib.includec("stdio.h")

-- The keyword 'terra' introduces
-- a new Terra function.
terra hello(argc : int, argv : &rawstring)
    -- Here we call a C function from Terra
    C.printf("Hello, Terra!\n")
    return 0
end

-- You can call Terra functions directly from Lua
hello(0,nil)

-- Or, you can save them to disk as executables or .o
-- files and link them into existing programs
terralib.saveobj("helloterra",{ main = hello })

Like C, Terra is a simple, statically-typed, compiled language with manual memory management. But unlike C, it is designed from the beginning to interoperate with Lua. Terra functions are first-class Lua values created using the terra keyword. When needed they are JIT-compiled to machine code.

Seems as if the target use case is high-performance computing. The team has also released a related paper, titled Terra: A Multi-Stage Language for High-Performance Computing:

High-performance computing applications, such as auto-tuners and domain-specific languages, rely on generative programming techniques to achieve high performance and portability. However, these systems are often implemented in multiple disparate languages and perform code generation in a separate process from program execution, making certain optimizations difficult to engineer. We leverage a popular scripting language, Lua, to stage the execution of a novel low-level language, Terra. Users can implement optimizations in the high-level language, and use built-in constructs to generate and execute high-performance Terra code. To simplify meta-programming, Lua and Terra share the same lexical environment, but, to ensure performance, Terra code can execute independently of Lua’s runtime. We evaluate our design by reimplementing existing multi-language systems entirely in Terra. Our Terra-based auto-tuner for BLAS routines performs within 20% of ATLAS, and our DSL for stencil computations runs 2.3x faster than hand-written C.

Unordered pairs and their representation

Occasionally situations arise where a type of unordered pairs would be very handy.

Perhaps the most important example is for representing edges in undirected graphs.

I came across another example that is instructional in a paper by C.J. Date entitled "The Primacy of Primary Keys: An Investigation". He wonders how to define a relation to store information about marriages, under the definition that a marriage is between two people. He gives a definition like:

MARRIAGE ( HUSBAND, WIFE, DATE_OF_MARRIAGE )
    CANDIDATE KEY ( HUSBAND, DATE_OF_MARRIAGE )
    CANDIDATE KEY ( WIFE, DATE_OF_MARRIAGE )

and raises the issue that there is no reason to prefer designating one key or the other as primary. If you amend the example so that there are not distinguished roles like "husband" or "wife" that could provide sensible labels for two separate attributes each storing a single person identifier, then this seems to be a natural spot for an unordered pair type. (Please refrain from discussing the politics of this definition, there are plenty of places on the internet for that.)

I set about thinking of how to make such a type and what its interface could be. I ended up with nearly the exact same interface I had for general sets. I am working on a language for expressing database queries, so I want to strictly control non-determinacy. (Adopting Haskell concrete syntax for purposes of communication...) I have a member :: a -> UnorderedPair a -> Bool containment test, an instance Eq a => Eq (UnorderedPair a), and a function like elements :: UnorderedPair a -> Set a. There's no pattern matching on unordered pairs because it would be so easy to accidentally shoot yourself in the foot with it, and no pattern matching on sets either. There is toList :: Set a -> NonDeterministic [a] though if you need an escape hatch. I also have an instance Ord a => Ord (UnorderedPair a) and inorder :: Ord a => UnorderedPair a -> (a, a).

Overall I'm reasonably happy with this. But I got to wondering whether it would be better to have a more general facility and view unordered pairs as a special case of it. You can look at them as the quotient of pairs with respect to an equivalence relation that permutes the ordering, and then you get unordered triples and ... too, if you can think of anything to use them for. Or you can look at them as a refinement type of sets with cardinality 2.

Those definitions of course differ on whether (3,3) :: UnorderedPair Integer though, and it seems like there may be applications for both answers. A fair amount of literature discusses undirected graphs with loops edges from one vertex to itself. But at least as much literature discusses graphs where such edges are disallowed, and disallowing them seems more natural for the marriage example. I suppose with general purpose refinement types you could look at sets with cardinality in [1 : 2], having your cake and also eating it.

The quotient type approach struggles with one of my goals, because presumably the pattern matching function that applies on ordered pairs would still apply to the quotient unordered pairs without flagging the result as non-deterministic.

If you take the ad hoc approach (neither quotients nor refinement types) it is difficult to think of names for the unordered pairs with and without allowing the degenerate pairs relating a value to itself, so that is another challenge.

I wasn't able to dig up too many examples through search engines, does anyone have thoughts on the best approach to modeling these?

texts & otherwise for bottom up teaching/learning

Hi Folks,

My son is now studying computer science at college, and in looking over some of his course materials, I've noticed how differently computer science seems to be taught, compared to earlier times (in my case, early 70s, MIT).

Which leads me to wonder, if one wanted to teach/learn the way I did, bottom up, are there any good books (textbook format or otherwise), lecture notes, online tutorials, etc. that one might use to proceed from:
- math, theory of computation background
- basic hardware concepts -> computer architecture
- machine language, assembly language (including assembler and linker concepts
- higher level languages (including compiler and interpreter concepts/designs)
- algorithms and problem analysis/solution
- operating systems
- databases
- networking
- systems
- systems of systems

A few old standards come to mind:
- Structure and Interpretation of Computer Programs (Sussman, et. al.)
- The C Programming Language & The Unix Programming Environment (Kernighan and Pike) - still find these useful occasionally
- Knuth (obviously)
- Comer, Tannenbaum, Stallings for networking

I'm drawing a blank, though, when it comes to "what's under the hood" - going bottom up through basic computing concepts, hardware architecture, machine and assembly language, assemblers, interpreters, compilers, operating systems.

So a question to folks here: what's stuck with you folks over the years, and/or what have you come across more recently, that might fit the bill?

Miles Fidelman

We're doing it wrong....

Poul-Henning Kamp makes an interesting case in favor of his opinion that CS education in analyzing algorithm complexity, is focusing on a model of computation which is simplified to the point of being stupid and wrong, and that it leads students to real-world failures in writing performant code.

By taking page swapping into account, he has achieved a one to two order-of-magnitude speedup in normal use cases over a "standard, provably optimal" algorithm used throughout computer science. He argues that the model of computer memory as having uniform access time is and has for several decades, been wrong and misleading to the detriment of the students and standards of performance. He's on about virtual memory and disk swapping in particular, but his points are just as valid in a system with a high-speed memory cache fronting a slower main memory, which is essentially all of them. Here is a link to his article:

You're doing it wrong

This is of interest to language developers because most people aren't actually writing those fundamental algorithms anymore. They're using the libraries that the language developers provide. So let's not provide stupid and wrong, eh?

On the other hand, I don't see how his described application (a server) requires the ordering that a tree structure provides. It isn't in the business of efficiently providing lots of dynamic ordered lists of the documents it serves; it's in the business of efficiently providing lots of served documents. If I were attempting to optimize access to documents, I'd be creating a hash table of pointers to these documents rather than a tree of any description, thus never needing to access more than one page (or maybe two for rehashes in the small fraction of cases where hash bucket overflows have happened) in order to find out where the document is kept.

On the gripping hand, his described tree structure is perfect for collections that have to support a lot of insertion and deletion while remaining ordered, so I can see it as having good use to implement, say, database tables.

Go-style defer-recover exception handling is better than try-catch style?

The main advantage of defer-recover exception handling over the try-catch is that it doesn't complicate the control flow of the program. This not only makes programs using defer-recover more readable, but also makes the implementation of this feature in a language much simpler than implementing try-catch exception handling.

On the other hand, the main disadvantage (in Go) is that, it does not allow handling exceptions in place close to where exceptions may arise, and the normal execution has to be resumed in the caller of the function that handles the exception.

Recently I supported Go-style panic/exception handling by defer-recover in my hobby language Dao (it was mentioned here before, but there have been many changes since then). I solved the mentioned problem in Dao by supporting a new type of code block (tentatively called framed block for convenience). Such framed block allows to handle exceptions around the place they arise, and to resume normal execution right after the framed block (namely in the same function where the exception is handled). This is achieved by executing framed blocks in new stack frames, which reuse/share the stack data of their surrounding functions and trigger execution of deferred blocks at exit as normal. You can find more details in a blog I just wrote.

Are there many people here finding Go-style exception handling interesting? I wonder if my improvements could make exception handling by defer-recover more interesting, and better than try-catch?

XML feed