Language combining relational algebra and domain algebra

I'm working on a language (and implementation) that combines elements of relational algebra and domain algebra. For clarity, the relational algebra is about operations with relations as arguments, such as join, union and 'is subset of'. The domain algebra is about iterated tuple-level operations used to select, generate or modify relations, such as '>=', '+' and SUM().

The purpose is similar to the Functional Relational Programming described but not implemented in Out of the Tar Pit.

I have a bibliography of papers and products for some 25 or so relational languages spanning over 30 years, including the Third Manifesto and Tutorial-D, Rel, Alf, Muldis, SIRA-PRISE, BS12, Dataphor, etc. They all emphasise the relational side, and while most have a domain algebra component they do not treat it in any detail. I'm looking for more material on the domain side.

I know of one set of works directly on the subject: Aldat. I would appreciate any links or pointers for additional works in this area, preferably active now or in the recent past.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

two different approaches

From mine experience I concluded that there are two basic approaches when working with sets of data:

1. set builder notation

2. cartesian product, union, intersect and substract set algebra combined with "filter", "map" and "fold" higher order functions

Both ways are complete (by my opinion) and can be used for a new data language. If there exists any other way, I would like to read about it.

Relational Algebra

This sounds like a non-minimal implementation of relational algebra. The set builder gives you projection and restriction (plus other stuff), set operations union and intersection are as they are, and the cross product is a full outer join (you only need one join but there are usually several for convenience).

SQL is

... actually imlpementation of a method number 2. Method number 1. also found a way in a lot of programming languages.

Underestimate SQL?

If you assume you start with a table that contains the domain in a column called j, you can do things like:

[j, 2*j+1 | j <- integers]

select (j, 2*j+1) from integers;

So apart from different syntax I don't see much difference.

in set builder notation...

[edit]

... union of A and B is written like:
[x | x element of A or x element of B]

intersection of A and B is written like:
[x | x element of A and x element of B]

and product of A and B is written as:
[(x, y) | x element of A and y element of B]

SQL syntax uses one expression for map (columns), second expression for domains (from clause), third expression for filter (where clause) and fourth expression for unions (union). I also find expressions "for every" and "for any" very useful in set builder notation which I'd like to see in SQL implementation too.

Overall, I like classic math set builder notation more than SQL implementation. There we have just two expressions, namely "map" part and "filter" part.

If you think closer about set builder notation in the way math describes it, you will conclude that there is some complication in filter part in which domain expressions ("element of x") is mixed up with predicates which filter elements. All implementations of set builder notations in different languages strictly divide domain attributes from filter predicate attributes, but I'd like to see it in a way math defines it.

To have it in a math way, all elements which pass through "filter" part should be held in the same main set (set of all sets). Then all work is to filter that main set and pass it through mapping on the left side to get built set.

SQL actually holds elements in predefined different sets (tables) and there is no notion of "set of all sets" from which we could build other sets only by simply filtering elements, so we have to use extra expressions for defining domains of elements plus one more outer expression for unions if we need them.

I think that the main difference is that set builder notation combines domains and filter predicates in one expression, while SQL clearly divides those into several different expression parts.

Maybe it is just a matter of personal preference, but I like math set builder more. Yet, there is still some more thinking to do for holding elements in subsets before actually implementing a database language. If only we had a way to hold all different records in the same table... oh, wait, there is a solution: read this if you're interested in the subject. If anyone is interested for actual implementation of database specific language that operates on described forms of data, I'd like very much to hear your opinions.

Completeness vs Consistency?

I thought that was a good summary of the differences. The comment regarding the set of all sets was particularly insightful. It seems that you could define the Russel paradox in set builder but not in SQL? Whether this is a good or bad thing depends on the preference for consistency vs completeness.

SQL

Is SQL an example of the kind of language you are interested in? It seems to fulfil the requirements of both relational and domain algebras (if I understand what you mean by them).

To clarify by domain algebra you mean the (natual) algebra associated with a data type. For example a Boolean 'column' would require Boolean algebra?

There is an interesting point here about the fact it is a 'Boolean' value not a true/false value, as there are many logics that can operate on truthfulness, for example a Heyting algebra. Presumably we would call the 'column' associated a Heytingean?

al jebr vs. calculus

The Alf missives say that SQL is a declarative calculus which is one of the reasons it sucks so much for us regular programmers these days, whereas a relational al jebr is much more composeable.

Declarative is good.

Declarative languages are supposed to be clearer and easier to understand. I don't think actually writing code to do the equivalent of a complex SQL query using loops/recursion would be in any way pleasant. LINQ for C# is proof of the elegance and power of embedding relations into the host language (which is an embedding of a subset of SQL).

SQL is an implementation of relational algebra not a calculus as I understand it. It is certainly a complete algebra, but it does allow some calculus like expressions, but I don't think it is complete.

Not declarative

SQL is not really declarative, but it isn't procedural either. A relational programming language certainly wouldn't have explicit loops and yes, LINQ is a good example of the power of such a thing.

SQL queries and schemas are declarative

The SQL 'create-table' syntax seems to be a declarative schema of the database, and SQL select statements seem to be declaratively be defining views of the data.

It does seem that Insert/Update/Delete are not declarative, but I don't think they are part of the relational algebra anyway, as that does not deal with mutating relations as far as i can remember.

So far as SQL covers the original posting (relational algebra + domain algebra) the two parts of SQL that seem directly relevant, schemas and queries are declarative.

Composable matters

Yes, an interesting language would probably aim at better composability than SQL (which is pretty bad).

Yes, SQL is probably the

Yes, SQL is probably the best known. The domain algebra (or attribute algebra) is the the bit that deals with columns as the relational algebra deals with tables. The languages I am interested in would probably seek to replace SQL.

Alloy

Alloy (http://alloy.mit.edu) is another great example of a relational language. It also appears to emphasize the relational side, but since it's well documented, it may be worth checking out.

I'm on the same boat (but

I'm on the same boat (but I'm a noob in this area of build languages). Do you have the code open source? I'm doing a prototype in F#, and with each day I start to form a clear idea in how do this.

Also, can interest you to see the DBase family (specially foxpro), that is as close to successfully and popular implementation of a database-first language yet... is the one I wish to replicate