LtU Forum

Unordered pairs and their representation

Occasionally situations arise where a type of unordered pairs would be very handy.

Perhaps the most important example is for representing edges in undirected graphs.

I came across another example that is instructional in a paper by C.J. Date entitled "The Primacy of Primary Keys: An Investigation". He wonders how to define a relation to store information about marriages, under the definition that a marriage is between two people. He gives a definition like:

MARRIAGE ( HUSBAND, WIFE, DATE_OF_MARRIAGE )
    CANDIDATE KEY ( HUSBAND, DATE_OF_MARRIAGE )
    CANDIDATE KEY ( WIFE, DATE_OF_MARRIAGE )

and raises the issue that there is no reason to prefer designating one key or the other as primary. If you amend the example so that there are not distinguished roles like "husband" or "wife" that could provide sensible labels for two separate attributes each storing a single person identifier, then this seems to be a natural spot for an unordered pair type. (Please refrain from discussing the politics of this definition, there are plenty of places on the internet for that.)

I set about thinking of how to make such a type and what its interface could be. I ended up with nearly the exact same interface I had for general sets. I am working on a language for expressing database queries, so I want to strictly control non-determinacy. (Adopting Haskell concrete syntax for purposes of communication...) I have a member :: a -> UnorderedPair a -> Bool containment test, an instance Eq a => Eq (UnorderedPair a), and a function like elements :: UnorderedPair a -> Set a. There's no pattern matching on unordered pairs because it would be so easy to accidentally shoot yourself in the foot with it, and no pattern matching on sets either. There is toList :: Set a -> NonDeterministic [a] though if you need an escape hatch. I also have an instance Ord a => Ord (UnorderedPair a) and inorder :: Ord a => UnorderedPair a -> (a, a).

Overall I'm reasonably happy with this. But I got to wondering whether it would be better to have a more general facility and view unordered pairs as a special case of it. You can look at them as the quotient of pairs with respect to an equivalence relation that permutes the ordering, and then you get unordered triples and ... too, if you can think of anything to use them for. Or you can look at them as a refinement type of sets with cardinality 2.

Those definitions of course differ on whether (3,3) :: UnorderedPair Integer though, and it seems like there may be applications for both answers. A fair amount of literature discusses undirected graphs with loops edges from one vertex to itself. But at least as much literature discusses graphs where such edges are disallowed, and disallowing them seems more natural for the marriage example. I suppose with general purpose refinement types you could look at sets with cardinality in [1 : 2], having your cake and also eating it.

The quotient type approach struggles with one of my goals, because presumably the pattern matching function that applies on ordered pairs would still apply to the quotient unordered pairs without flagging the result as non-deterministic.

If you take the ad hoc approach (neither quotients nor refinement types) it is difficult to think of names for the unordered pairs with and without allowing the degenerate pairs relating a value to itself, so that is another challenge.

I wasn't able to dig up too many examples through search engines, does anyone have thoughts on the best approach to modeling these?

texts & otherwise for bottom up teaching/learning

Hi Folks,

My son is now studying computer science at college, and in looking over some of his course materials, I've noticed how differently computer science seems to be taught, compared to earlier times (in my case, early 70s, MIT).

Which leads me to wonder, if one wanted to teach/learn the way I did, bottom up, are there any good books (textbook format or otherwise), lecture notes, online tutorials, etc. that one might use to proceed from:
- math, theory of computation background
- basic hardware concepts -> computer architecture
- machine language, assembly language (including assembler and linker concepts
- higher level languages (including compiler and interpreter concepts/designs)
- algorithms and problem analysis/solution
- operating systems
- databases
- networking
- systems
- systems of systems

A few old standards come to mind:
- Structure and Interpretation of Computer Programs (Sussman, et. al.)
- The C Programming Language & The Unix Programming Environment (Kernighan and Pike) - still find these useful occasionally
- Knuth (obviously)
- Comer, Tannenbaum, Stallings for networking

I'm drawing a blank, though, when it comes to "what's under the hood" - going bottom up through basic computing concepts, hardware architecture, machine and assembly language, assemblers, interpreters, compilers, operating systems.

So a question to folks here: what's stuck with you folks over the years, and/or what have you come across more recently, that might fit the bill?

Miles Fidelman

We're doing it wrong....

Poul-Henning Kamp makes an interesting case in favor of his opinion that CS education in analyzing algorithm complexity, is focusing on a model of computation which is simplified to the point of being stupid and wrong, and that it leads students to real-world failures in writing performant code.

By taking page swapping into account, he has achieved a one to two order-of-magnitude speedup in normal use cases over a "standard, provably optimal" algorithm used throughout computer science. He argues that the model of computer memory as having uniform access time is and has for several decades, been wrong and misleading to the detriment of the students and standards of performance. He's on about virtual memory and disk swapping in particular, but his points are just as valid in a system with a high-speed memory cache fronting a slower main memory, which is essentially all of them. Here is a link to his article:

You're doing it wrong

This is of interest to language developers because most people aren't actually writing those fundamental algorithms anymore. They're using the libraries that the language developers provide. So let's not provide stupid and wrong, eh?

On the other hand, I don't see how his described application (a server) requires the ordering that a tree structure provides. It isn't in the business of efficiently providing lots of dynamic ordered lists of the documents it serves; it's in the business of efficiently providing lots of served documents. If I were attempting to optimize access to documents, I'd be creating a hash table of pointers to these documents rather than a tree of any description, thus never needing to access more than one page (or maybe two for rehashes in the small fraction of cases where hash bucket overflows have happened) in order to find out where the document is kept.

On the gripping hand, his described tree structure is perfect for collections that have to support a lot of insertion and deletion while remaining ordered, so I can see it as having good use to implement, say, database tables.

Go-style defer-recover exception handling is better than try-catch style?

The main advantage of defer-recover exception handling over the try-catch is that it doesn't complicate the control flow of the program. This not only makes programs using defer-recover more readable, but also makes the implementation of this feature in a language much simpler than implementing try-catch exception handling.

On the other hand, the main disadvantage (in Go) is that, it does not allow handling exceptions in place close to where exceptions may arise, and the normal execution has to be resumed in the caller of the function that handles the exception.

Recently I supported Go-style panic/exception handling by defer-recover in my hobby language Dao (it was mentioned here before, but there have been many changes since then). I solved the mentioned problem in Dao by supporting a new type of code block (tentatively called framed block for convenience). Such framed block allows to handle exceptions around the place they arise, and to resume normal execution right after the framed block (namely in the same function where the exception is handled). This is achieved by executing framed blocks in new stack frames, which reuse/share the stack data of their surrounding functions and trigger execution of deferred blocks at exit as normal. You can find more details in a blog I just wrote.

Are there many people here finding Go-style exception handling interesting? I wonder if my improvements could make exception handling by defer-recover more interesting, and better than try-catch?

Typographical extensions for programming languages: breaking out of the ASCII straitjacket

A paper by Paul W. Abrahams circa 1993, abstract:

Using extended typography, we can design programming languages that utilize modem display and input technologies, thus breaking out of the ASCII straitjacket. We assume that a language has three representations: a visual representation that describes its displayed form, an internal representation defined for each implementation, and an interchange representation, expressed in pure ASCII, that is defined across all implementations. Using extended typography we can use distinctive typefaces to indicate keywords, thus removing the need to reserve them, and can introduce a variety of new symbols more meaningful than those used in most current programming languages. One benefit is the possibility of arbitrary user-defined operators. We can also introduce new kinds of brackets and methods of pairing brackets visually. Extended typography also helps to solve the problems of writing programs in languages other than English.

Sorry I couldn't find a non digital library link, but maybe this would help. I'm surprised no one has looked at this topic again since, at least in academia. Thoughts?

Nesting of quasi-quotation

I'm confused by the issue of producing programs using quasi-quotation that themselves use quasi-quotation. Lets use <...> for quotation and ~ for splicing. Consider the expression <...<...~e...>...> where e is some term that produces code. Should that evaluate to the program ...<...e...>... with e spliced in, or should that evaluate to the program ...<...~e...>... that contains a splicing sub-expression? Given the first choice, how would you achieve the behavior of the second choice? And given the second choice, how would you quote a program with a splice outside of a quote? How does that all fit together in a coherent and expressive whole? What about indicating at the splice point to which quotation that splice belongs? Or introduce an escape construct \ so that you can express quoting a splice as <...<...\~e...>...>?

A little challenge

I noticed a fun thing in the Rust manual, a list of languages that influenced rust. So here's the challenge (insiders ineligible): Which features are inspired by which of these languages?

MLWorks Project now BSD licensed

What? MLWorks is an "industrial strength" Standard ML compiler and integrated development environment, developed by Harlequin in the 1990s, recovered in 2013 and open sourced.

Molecule: using monadic and streaming I/O to compose process networks on the JVM

This is a paper we presented at OOPSLA 2012:

Molecule is a domain specific language library embedded in Scala for easing the creation of scalable and modular concurrent applications on the JVM. Concurrent applications are modeled as parallel process networks that exchange information over mobile and type-safe messaging interfaces.

In this paper, we present a concurrent programming environment that combines functional and imperative programming. Using a monad, we structure the sequential or parallel coordination of user-level threads, without JVM modifications or compiler support. Our mobile channel interfaces expose reusable and parallelizable higher-order functions, as if they were streams in a lazily evaluated functional programming language. The support for graceful termination of entire process networks is simplified by integrating channel poisoning with monadic exceptions and resource control. Our runtime and system-level interfaces leverage message batching and a novel flow parallel scheduler to limit expensive context switches in multicore environments. We illustrate the expressiveness and performance benefits on a 24-core AMD Opteron machine with three classical examples: a thread ring, a genuine prime sieve and a chameneos-redux.

We have been asked several times if the code was open source during the reviews but unfortunately we did not have the permission to share the code at the time. Now that we have it, I'm pleased to announce that the sources of the library and the benchmarks are available on GitHub. There is also a mailing-list for project specific questions.

A retro-active programming language

I've been busy developing the SPREAD language. After a lot of tinkering, I'm pretty much done with the semantics. I believe the result can be considered 'kind of weird' but very powerful.

I have a hard time classifying the SPREAD language - may be you can help me classify it? The best classification I've come up with is that SPREAD is 'a retro-active programming language'.

Bold claim: retro-active programming subsumes reactive functional programming, and much more.

Comments are very much appreciated!

XML feed