Binary relations

A while ago I read about an alternative to SQL called something like 'Binary Query Language', which models data in (to me) the more intuitive way, thinking of sets and different types of relations (in the mathematical sense) between those sets, rather than using tables, foreign keys, joins etc like the relational model. It lets you do queries based around these relations using functions and multifunctions in a quite neat almost mathematical way.

Does anyone know what (if anything) became of these ideas? the only reference to BQL I can seem to trawl up through google is on this site, which offers an interesting explanation, some examples and a basic implementation in C++.

What I'm thinking at the moment is how useful would it be to write some kind of abstraction that translates these kind of queries into plain SQL on the fly, having generated a suitable SQL type database schema from the 'binary relations' type model.

I guess there are a lot of Object-Relational database things around at the moment designed for persisting object-oriented classes, but I don't think that's quite the same thing?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Re: Binary Relations

Modeling data as binary relations shows up frequently in semantic data modeling. Take a look at this survey, section 3.3 is on binary models.

You would probably be interested in more recent research in this area. It's difficult to know where to start. If you put any four experts into a room you will get five different modeling languages (paraphrasing an old joke). I recommend ORM. Terry Halpin's book on Information Modeling is very good. The website for the book is www.orm.net.

Translating queries from these languages into SQL has been done. But it has not been commercially viable. No one really cares to use a non-standard language even if it is "better".

OTOH, translating SQL into a binary model has worked well in some cases. SybaseIQ is based on binary relations under the covers. There are others too. A DBMS with this design is usually called vertically partitioned or column-oriented. Moa is a current research project encompassing this.

ORM

I've the second edition of the book "Information Modeling and Relational Databases", it's something to recommend.

My Blog

binary relations / RDF

Um, the relational model is also quite thoroughly "mathematical"; if you want to compare like with like, it may be better to think in terms of relations and relvars rather than tables, and Codd's relational algebra and relational calculus rather than SQL.

The folks over at Database Debunkings seem to be of the opinion that RDF/OWL is a reborn "binary relations" model (brief discussion here). I don't know if they're right about that or not, but I did notice that some of the diagrams on the Andrew Girov site looked distinctly like RDF (directed labelled) graphs. There is work afoot on developing query languages for RDF - see eg. Euler and RIL.

I'm a little bothered by Girov's characterisation of RDBMSs as "record based". Seems like getting off on the wrong foot to me.

Datalog

A table is almost exactly a n-ary mathematical relation over the basic datatypes, or primitive sets. In fact, *most* SQL concepts have a straightforward interpretation in Set Theory. (Even if that gets lost under the many bells and whistles that have added to SQL over the years.)

If you are really interested in exploring different query languages, I highly reconmmend the book Principles of Database and Knowledge-base systems by Jeffrey Ullman (of Hopcroft-Ullman and Aho-Hopcroft-Ullman fame) It covers the Heirarchical, Network, Relational, and Object-Oriented database models, as well as giving you a flavor for the query languages ISBL, QUEL (which is PostgreSQL's original query language), Query-by-Example, QBE, SQL, DBTG, IMS, and OPAL. Sadly, you'd be hard pressed to come up with actual database software to toy with most of these languages.

The rest of the book gets into all the gory details and many of the algorithms used in an actual database implementation. Cool stuff, if you are interested. I've never really understood why relational database software does not have a "logical" schema which would be completely normalized and that all your queries would be based around, and a "physical" schema where the tables could be flexibly de-normalized as appropriate. Instead we get stuck in writing database triggers/stored procedures/extra program code to try to maintain integrity in a denormalized database, and if we want to tweak the schema, we often need to tweak our programs.

One of the more interesting aspects of the book is that it details the link between logic programming languages, such as Prolog, and databases. Basically, it calls databases equivalent to Prolog as a "knowledge base" system, and gives a restriction of Prolog, which Ullman calls "Datalog", that is basically equivalent to SQL. In fact, other than where Ullman touches on the other query languages, Datalog is usually used as the basis of discussion.

But as Prolog is a Turing-complete and database query languages aren't, implementing knowledge-base systems efficiently is a fundamentally harder (and in many cases intractable) problem, whereas database systems are relatively easy. And of course, Turing-complete languages don't have straightforward set-theoretic interpretations.

Generic Persistence for CLP

Manuel Hermenegildo gave a talk at PADL about a technique he calls generic persistence. The way I understood it is that one stores Prolog facts in a file or database between program executions in a transparent way, and this includes support for dynamic updates using assert/retract. In consequence you can use Prolog instead of SQL or a conventional database interface to access a database; if you use an SQL DB as a backing store, then the implementation translates queries into SQL code as necessary, and even optimizes them by producing them statically if possible, and bunching them together to reduce socket bandwidth.

I found it quite interesting, even though I am not entirely clued in about LP and (much less) DBs, since I would much prefer to use Prolog rather than SQL as a language for interacting with databases. I can't find the PADL paper online, but there is a tech report of approximately the same name:

J. Correas, J. M. Gomez, M. Carro, D. Cabeza, M. Hermenegildo. A Generic Persistence Model for CLP Systems. Num. CLIP3/2003.0, 15 pages, Technical University of Madrid, August 2003.