Lambda the Ultimate

SchemeUnit and SchemeQL: Two Little Languages
started 9/3/2002; 3:41:48 AM - last post 9/14/2002; 11:12:28 AM

Ehud Lamm - SchemeUnit and SchemeQL: Two Little Languages

9/3/2002; 3:41:48 AM (reads: 3181, responses: 23)

SchemeUnit and SchemeQL: Two Little Languages. Noel Welsh, Francisco Solsona, Ian Glover.

An interesting exercise in embedding little languages (or are these really frameworks?) in Scheme.

I particularly like the discussion of SQL embedding in sections 3.1 and 3.2. It shows the advantages that come from thinking at the programming language level.

Posted to Software-Eng by Ehud Lamm on 9/3/02; 3:42:48 AM

Frank Atanassow - Re: SchemeUnit and SchemeQL: Two Little Languages

9/3/2002; 9:23:42 AM (reads: 2237, responses: 0)

I've only perused this paper so far, but it looks interesting. If I understood what I glanced through, I like the functional approach with -/cc functions for cursors. In Haskell I would do this with monadic operators. (I wonder if the ability to call such a continuation more than once is useful, though.)

Also extremely pleasant to see work of this quality from authors in industry.

Leandro - Re: SchemeUnit and SchemeQL: Two Little Languages

9/5/2002; 12:42:58 AM (reads: 2106, responses: 0)

Very interesting indeed. Too bad the authors don't know that SQL isn't really relational... but perhaps that work can serve as a foundation for a work on a functional, truly relational database language, perhaps even superseding Alphora's Dataphor D4, IBM BS12 and Berkeley QUEL.

Noel Welsh - Re: SchemeUnit and SchemeQL: Two Little Languages

9/5/2002; 2:31:51 AM (reads: 2124, responses: 0)

Frank, you're very kind but I don't think we are as clever as you make out. The -/cc functions just use the current cursor, a dynamically scoped variable. Linearizing the connection object is an interesting idea but I don't think it buys you anything as part of the point of DBs is concurrent access.
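To illustrate what "the current cursor, a dynamically scoped variable" can look like, here is a minimal sketch in Python rather than Scheme. The names `with_cursor` and `fetch_row_cc` are hypothetical and are not SchemeQL's API; the sketch only assumes that a "-/cc" style function reads an implicitly bound cursor instead of taking one as an argument.

```python
import sqlite3
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical names throughout; this is not SchemeQL's actual interface.
_current_cursor = ContextVar("current_cursor")


@contextmanager
def with_cursor(cursor):
    """Bind `cursor` as the current cursor for the dynamic extent of the block."""
    token = _current_cursor.set(cursor)
    try:
        yield cursor
    finally:
        _current_cursor.reset(token)  # restore whatever was bound before


def fetch_row_cc():
    """Fetch the next row from whichever cursor is currently bound."""
    return _current_cursor.get().fetchone()


def cursor_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (n INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
    rows = []
    with with_cursor(conn.execute("SELECT n FROM t ORDER BY n")):
        rows.append(fetch_row_cc())
        # A nested binding shadows the outer cursor only inside this block.
        with with_cursor(conn.execute("SELECT n FROM t ORDER BY n DESC")):
            rows.append(fetch_row_cc())
        rows.append(fetch_row_cc())  # back to the outer cursor
    conn.close()
    return rows
```

Calling `cursor_demo()` reads one row from the outer cursor, one from the nested cursor, and then another from the outer cursor, without any cursor being passed explicitly to `fetch_row_cc`.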

Leandro, I can assure you the authors know that SQL isn't really relational. To quote the paper: "SQL is a complex mix between the relational algebra and the relational calculus". An earlier version of the paper griped about the non-relational character of SQL, referencing famous SQL detractor C. J. Date, but we removed that part. The goal of SchemeQL is to work with existing databases, meaning working with SQL. Other approaches have defined new database query languages. We reference some of this work in the Related Work section:

Structural Recursion as a Query Language (1991)
Val Breazu-Tannen, Peter Buneman, Shamim Naqvi

This is really cool stuff but we don't have the resources to change the world's database systems.

Leandro - Re: SchemeUnit and SchemeQL: Two Little Languages

9/5/2002; 2:56:33 AM (reads: 2118, responses: 2)

Thanks for enlightening me!

Nevertheless, the paper still calls SQL a relational language by stating that SchemeQL is "a language for manipulating relational databases", and that "SQL is the standard interface to relational databases". Moreover, it's not by any means "implemented by all major […] DBMSs": just for instance, Oracle isn't compliant, not even in its data types. So in the end, the paper helps keep alive a pernicious piece of disinformation spread by SQL vendors.

We don't have the resources to change the world's DBMSs, but a concerted effort to develop new ones could have as big an impact as Codd's original work has had. I'm only aware of one such effort, Alphora's Dataphor which is currently proprietary and MS Windows-only. But it is a valid proof of concept, meaning there could be a free software implementation eventually.

The contribution you could make is educating about relational issues, and your paper would be perfect to help in that, even if only as a side concern. And it wouldn't have an impact on its main goals. Why did you remove the gripe about SQL from it?

Noel Welsh - Re: SchemeUnit and SchemeQL: Two Little Languages

9/5/2002; 5:11:37 AM (reads: 2153, responses: 1)

It's my turn to be enlightened. Dataphor looks very interesting and well worth discussion as a separate topic. All I know about Date's "Third Manifesto" comes from wiki: http://c2.com/cgi/wiki?TheThirdManifesto . This suggests that functional/logic languages are the way to go for databases (reading between the lines).

Anyway, the fact remains that we don't have the resources to develop a new database technology. We could have discussed the manifest deficiencies of SQL in more detail (SchemeQL would be a lot cleaner if SQL was) but we felt that would distract from the main thrust of the paper. If we were to develop a cleaner query language (maybe by building Scheme into Postgres or similar?) that would be an appropriate place to redress our discussion of SQL.

Here's the closest I could find to a reference on the extended set theory mentioned on Wiki: http://xsp.xegesis.org/

Ehud Lamm - Re: SchemeUnit and SchemeQL: Two Little Languages

9/5/2002; 12:15:41 PM (reads: 2199, responses: 0)

From our perspective it may be interesting to distinguish between the execution model of the language (i.e., SQL) and the way the underlying engine (i.e., DBMS) works.

Florian Hars - Re: SchemeUnit and SchemeQL: Two Little Languages

9/6/2002; 7:38:36 AM (reads: 2054, responses: 0)

Noel, There is something which is obviously not the book mentioned on the wiki, but has the same topic, authors and title at http://citeseer.nj.nec.com/darwen95third.html

Leandro - Re: SchemeUnit and SchemeQL: Two Little Languages

9/6/2002; 8:14:22 AM (reads: 2062, responses: 0)

Here's the closest I could find to a reference on the extended set theory mentioned on Wiki: http://xsp.xegesis.org/

Only that "extended set theory" sounds like snake oil to me. If a set is extended, it's not a set anymore.

For instance, SQL tables aren't relations, and one of the reasons for that is that, containing duplicates, they are not sets at all, but bags. Now you can make operations on bags, but it is much more complicated and adds no power at all. Not only that, but SQL doesn't get even bags correct.
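As a small illustration of the bag-versus-set point, the sketch below uses Python's bundled sqlite3 module. The table and column names are made up for the example, and the behaviour shown is SQLite's; other DBMSs should behave the same way for this case, but that is an assumption.

```python
import sqlite3


def bag_demo():
    conn = sqlite3.connect(":memory:")
    # No key is declared, so nothing stops identical rows from being stored.
    conn.execute("CREATE TABLE visits (person TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO visits VALUES (?, ?)",
        [("ann", "oslo"), ("ann", "oslo"), ("bob", "rome")],
    )
    # The table behaves as a bag: the duplicate row comes back twice.
    bag = conn.execute(
        "SELECT person, city FROM visits ORDER BY person"
    ).fetchall()
    # DISTINCT has to be requested explicitly to get set semantics.
    as_set = conn.execute(
        "SELECT DISTINCT person, city FROM visits ORDER BY person"
    ).fetchall()
    conn.close()
    return bag, as_set
```

The first result contains the duplicated row twice, while the second contains each row once; a relation in the strict sense would only ever look like the second.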

Similarly, the relational database management model is based on predicate logic. So is SQL, but with the catch that it is three-valued logic, and also incorrect.

Noel, There is something which is obviously not the book mentioned on the wiki, but has the same topic, authors and title at http://citeseer.nj.nec.com/darwen95third.html

This is the paper, available in Postscript from http://www.acm.org/sigmod/record/issues/9503/manifesto.ps, which was published before the book. The book is basically the paper expanded and explained.

I would suggest reading the book, not only The Third Manifesto but everything you can afford by Christopher J Date, Hugh Darwen and Fabian Pascal. It's certainly worth it. There is some good material too at http://dmoz.org/Computers/Software/Databases/Relational/Model/, but nothing beats the books.

Chris Rathman - Re: SchemeUnit and SchemeQL: Two Little Languages

9/6/2002; 12:10:35 PM (reads: 2008, responses: 0)

For instance, SQL tables aren't relations, and one of the reasons for that is that, containing duplicates, they are not sets at all, but bags.

Maybe I'm misunderstanding, but doesn't the current slate of databases pretty much allow you to set unique primary keys for any particular table? Is the problem that you believe this unique key must be enforced with no exceptions allowed? Or is it that SQL has not standardized the protocol for defining unique primary keys?

So is SQL, but with the catch that it is three-valued logic, and also incorrect.

I guess I'm just not convinced that substituting a Default Value in place of Nulls accomplishes much.

Theoretical arguments aside, though, is there a description of the D language referenced in the Manifesto? Sifting through Dataphor's site, I see a snippet here and there, but I can't really get a feel for the overall language.

Alex Peake - Re: SchemeUnit and SchemeQL: Two Little Languages

9/6/2002; 7:27:58 PM (reads: 2004, responses: 0)

Chris,

If you really want a "no holds barred" view of just how wrong you are see:

http://www.dbdebunk.com/

Alex

Alex Peake - Re: SchemeUnit and SchemeQL: Two Little Languages

9/6/2002; 8:01:44 PM (reads: 1994, responses: 0)

Frank and Noel,

You both refer to "cursors" with some kind of glee. They are (mostly) considered harmful - from a scalability and performance perspective. They maintain unnecessary locks on the database. Perhaps you refer to some kind of reference into a disconnected relation?

Alex

P.S. Oops! I should have read the paper more carefully. Sorry!

Chris Rathman - Re: SchemeUnit and SchemeQL: Two Little Languages

9/7/2002; 12:08:28 AM (reads: 2010, responses: 0)

If you really want a "no holds barred" view of just how wrong you are see:

I have to futz with SQL day after day and find it both useful and frustrating. My problems with SQL, at least as I see them, have little to do with the possibility of duplicate rows or the existence of Nulls. Most of my problems with databases center around three recurring themes:

a). Trees: There's a lot of data that is hierarchical in nature. Traversing trees in SQL is a pain. How this fits into Date's and Pascal's view I haven't figured out. They allude to xml at points, but I can't figure out whether they think it's unnecessary or whether they think a database that is true to the Relational Model solves the problem. (Arguments that hierarchical databases were proven invalid in the 60's don't hold much weight in my view. Nor do arguments that hierarchical data is rare).

b). Temporal: Much of the data I come across is temporal in nature. What was the value last week? What is the value now? What is the value one week from now? What was the value last week for what I thought it would be this week? Data is by no means static, nor necessarily correct at any point in time. Many of the computations require not just a single value, but rather must factor in previous values that were incorrect and project what the values must be in the future (which may possibly be wrong as it stands today). Anyhow, most of the databases are geared towards a single value, not one that fluctuates over time - e.g. what's my salary today vs. what was my salary for the first 6 months this year, what is my salary for the last six months, and did the database accurately reflect that information at each point in time?

c). Incomplete information: Just the nature of the beast, but business rules are not always transparent. As much as you try to cover all the possible variations, there's always some hidden knowledge that is discovered well past the design time. Hopefully, you design the model to be flexible enuf to cope with the unknown but it always amounts to a tradeoff - with greater flexibility, complexity always seems to be a natural result. Data and Models are just approximations of things. How you cope with unforeseen things, many times differentiates success or failure. If you Model too early, the exceptions to the rules will eat your lunch. If you wait around for perfect information about the processes, you will be waiting for a long time.
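The salary question in (b) can be sketched with validity intervals. This is only one possible modelling, assuming half-open intervals [valid_from, valid_to) stored as ISO date strings and a far-future sentinel date for "still current"; the table, the column names and the figures are invented for illustration.

```python
import sqlite3

OPEN_END = "9999-12-31"  # assumed sentinel meaning "no end date yet"


def make_salary_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE salary (
            emp        TEXT NOT NULL,
            amount     INTEGER NOT NULL,
            valid_from TEXT NOT NULL,   -- inclusive
            valid_to   TEXT NOT NULL,   -- exclusive
            PRIMARY KEY (emp, valid_from)
        )
        """
    )
    conn.executemany(
        "INSERT INTO salary VALUES (?, ?, ?, ?)",
        [
            ("chris", 4000, "2002-01-01", "2002-07-01"),
            ("chris", 4500, "2002-07-01", OPEN_END),
        ],
    )
    return conn


def salary_as_of(conn, emp, day):
    """Return the salary whose validity interval contains `day`, or None."""
    row = conn.execute(
        "SELECT amount FROM salary "
        "WHERE emp = ? AND valid_from <= ? AND ? < valid_to",
        (emp, day, day),
    ).fetchone()
    return row[0] if row else None
```

With this data, `salary_as_of(conn, "chris", "2002-03-15")` finds the first-half figure and a September date finds the second. Note that this captures only when a value was valid, not what the database recorded at the time; answering "did the database accurately reflect that information at each point in time?" would need a second pair of timestamps for when each row was entered and superseded.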

+++

I suppose I agree with the relational experts that OO databases aren't a necessity from my standpoint - better tools for Object-to-Relational mapping are likely to bear more fruit. I also agree that SQL has a number of limitations and they crop up constantly. And I would also agree that a declarative programming language would probably be a better fit for the relational model.

However, much of the argument of the links cited here focuses too much on cries of purity - saying that SQL is not relational is meaningless - unless, of course, you can demonstrate how these shortcomings actually make our life more complicated than it needs to be.

What I'd like to see is the actual programming language involved. SQL may be impure and a hack, but it's pretty useful for a large number of problems - and relatively simple to use. I'm much more productive using SQL than I am in using Prolog to have my data questions answered - even though there are some types of questions where Prolog provides a more concise framework for questions - especially when it comes to handling trees. On most of the mundane questions I ask the database, though, SQL is usually up to the task.

Instead of indicting SQL for being impure, I'd rather see the results of a programming language that makes dealing with data easier - and not just for a certain class of problems.

Alex Peake - Re: SchemeUnit and SchemeQL: Two Little Languages

9/7/2002; 9:04:45 AM (reads: 1975, responses: 0)

Chris,

To answer your questions fully, you really need to read "The Third Manifesto" (Date & Darwen), "Practical Issues in Database Management" (Pascal) and many others. The web site above (http://www.dbdebunk.com/), if you can take the time to read through, will help you see the issues, but the books explain what, why, ...

Re: Trees:

the answer is: "...they think a database that is true to the Relational Model solves the problem..." (see references)

Re: Temporal Data:

see above answer

Re: Incomplete Information:

see above answer

[Quote(Darwen): "...A database is a set of axioms. The response to a query is a theorem. The process of deriving the theorem from the axioms is a proof. The proof is made by manipulating symbols according to agreed mathematical rules. The proof [that is, the query result] is as sound and consistent as the rules are."]

Paraphrasing Pascal: "...a row in a table is a statement of fact, or in logic, propositions about entities of interest asserted to be true..."

Re: What I'd like to see is the actual programming language involved

As mentioned by Leandro above: take a look at Alphora's Dataphor D4

Bottom line:

Really, you have to study before you become a practical database practitioner. At the very least read Pascal's book (above).

Alex

Leandro - Re: SchemeUnit and SchemeQL: Two Little Languages

9/7/2002; 9:07:41 AM (reads: 2062, responses: 0)

> Is the problem that you believe this unique key must be enforced with no exceptions allowed?

Exactly, for some reasons. One is simplicity. There is no need for duplicates in a normalised database, therefore they represent an unnecessary complication.

Linked to the first one, the second is performance. Duplicates represent an additional burden on performance enhancement by query transformers (incorrectly AKA optimizers). It will not be possible to get full performance from a database system if the query transformer has to deal with duplicates.

But most important, the relational algebra and calculus are not valid anymore. One could create bag systems, instead of relational ones. The names for such systems could be very catchy, like Frodo Baggins or Bilbo Baggins, but even without all the other messy misfeatures of SQL a bag system will never be as simple and logical as a relational one. Anyone who has dealt with users creating or interpreting SQL queries and their results knows how much confusion arises out of duplicates.

> I'm just not convinced that substituting a Default Value in place of Nulls accomplishes much.

I have to agree. But...

First, NULLs in SQL are messy. It does not even qualify as three-valued logic, because there is no consistent logic in operations on SQL NULLs. So even if you want three-valued logic, get a good implementation of it, not SQL.
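A few of the NULL behaviours under discussion can be observed directly. The sketch below uses Python's sqlite3 module with a made-up one-column table; it shows SQLite's behaviour, and details may differ in other DBMSs.

```python
import sqlite3


def null_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    # One ordinary value and two NULLs (None on the Python side).
    conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (None,)])

    def one(sql):
        return conn.execute(sql).fetchone()[0]

    results = {
        # Comparing NULL with NULL is neither true nor false: it yields NULL.
        "null_eq_null": one("SELECT NULL = NULL"),
        "rows": one("SELECT count(*) FROM t"),
        # WHERE keeps only rows whose condition is true, so the NULL rows
        # satisfy neither the condition nor its negation.
        "x_eq_1": one("SELECT count(*) FROM t WHERE x = 1"),
        "x_ne_1": one("SELECT count(*) FROM t WHERE NOT (x = 1)"),
        # DISTINCT, however, treats the two NULLs as the same value.
        "distinct_x": one("SELECT count(*) FROM (SELECT DISTINCT x FROM t)"),
    }
    conn.close()
    return results
```

The rows matching `x = 1` and the rows matching `NOT (x = 1)` do not add up to the total row count, and yet `DISTINCT` collapses the two NULLs into one even though `NULL = NULL` is not true. That mismatch is one concrete instance of the inconsistency complained about above.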

Second, undifferentiated NULLs are disinformation. Is that missing information? Or not applicable? Or what? Codd originally devised A-marks (missing) and I-marks (inapplicable). That is four-valued logic, much too complex even for technical personnel. A default scheme avoids this confusion by requiring the special values to be defined for each domain as need be.

> [...]is there a description of the D language referenced in the Manifesto.[?]

Yes, in the book. Good investment. Someone had promised me a BNF definition of Tutorial D several weeks ago to publish on my website, I am still waiting.

> How this fits into Date's and Pascal's view I haven't figured out.

Perhaps because you never cared to browse their books, or understand the fundamentals. Particularly, Pascal has a very simple, logical proposal for representing hierarchical data in his latest book. Good investment.

> Temporal [data]

I fail to see how temporal data can be better represented or dealt with other than with the relational model. It is inherently complex. But if you do a Google search on Transrelational, you will see that Date and Darwen are co-authoring a book with a Greek professor on precisely temporal relational databases. It is due this year end, cannot wait for it.

But the latest edition of Date's book already has a pretty good introduction to temporal data in relational databases.

> Incomplete information

Your concern is valid. The answer to that, incidentally, is again the relational model. Only with the relational model can you first implement a normalised logical model and then keep changing it according to new information while keeping users' and programs' access plans and queries fully valid, through physical and logical schema independence (mapping) and the good use of derived relations.
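The role of derived relations in keeping old queries valid can be sketched with an ordinary SQL view. This is only an approximation of the idea using SQLite through Python's sqlite3 module; the schema, the placeholder country value and the query are invented for the example.

```python
import sqlite3

# A query written against the original schema; it is never edited below.
OLD_QUERY = "SELECT name, city FROM people ORDER BY name"


def view_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT PRIMARY KEY, city TEXT)")
    conn.executemany(
        "INSERT INTO people VALUES (?, ?)", [("ann", "oslo"), ("bob", "rome")]
    )
    before = conn.execute(OLD_QUERY).fetchall()

    # New information arrives: cities now need a country. Split the base
    # table in two and re-expose the old shape as a derived relation.
    conn.executescript(
        """
        CREATE TABLE city (
            id INTEGER PRIMARY KEY, name TEXT UNIQUE, country TEXT
        );
        CREATE TABLE resident (
            name TEXT PRIMARY KEY, city_id INTEGER REFERENCES city(id)
        );
        INSERT INTO city (name, country)
            SELECT DISTINCT city, 'unknown' FROM people;
        INSERT INTO resident
            SELECT p.name, c.id FROM people p JOIN city c ON c.name = p.city;
        DROP TABLE people;
        CREATE VIEW people AS
            SELECT r.name AS name, c.name AS city
            FROM resident r JOIN city c ON c.id = r.city_id;
        """
    )
    after = conn.execute(OLD_QUERY).fetchall()
    conn.close()
    return before, after
```

The unchanged `OLD_QUERY` returns the same rows before and after the base tables are reorganised. The sketch covers reads only; in SQLite, writing through such a view would additionally need INSTEAD OF triggers.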

> [...]unless, of course, you can demonstrate how these shortcomings actually make our life more complicated than it needs to be.

Has been demonstrated. Buy all the books by Date, Darwen, Pascal and Codd you can get your hands on. Read all the articles, Google and DMoz are your friends. And DBDebunk too. Do your homework.

> What I'd like to see is the actual programming language involved.

So why are you not toying around with Alphora Dataphor yet? Its D4 language has been validated by Hugh Darwen as fully relational. It does not yet deliver the full performance benefits because it is still implemented as a translation layer over SQL DBMSs, but a full DBMS implementation is in the works.

Also, Tutorial D is well defined enough and has examples enough in books by Date for anyone to see its capabilities.

Someone may think that I have been making the relational model into a panacea. Nothing is a panacea, but the closest thing to it in logical data modelling is two-valued predicate logic and set theory. You cannot get saner and simpler than that.

Frank Atanassow - Re: SchemeUnit and SchemeQL: Two Little Languages

9/8/2002; 6:45:12 AM (reads: 1937, responses: 0)

You both refer to "cursors" with some kind of glee.

Do I? I think you are reading too much into my post.

They are (mostly) considered harmful - from a scalability and performance perspective. They maintain unnecessary locks on the database.

I don't use cursors or iterators, since I write relatively little imperative code. I agree with your assessment of their disadvantages. However, I nevertheless find it interesting to see such things formalized in a language like Scheme.

Perhaps you refer to some kind of reference into a disconnected relation?

No.

Frank Atanassow - Re: SchemeUnit and SchemeQL: Two Little Languages

9/8/2002; 7:01:01 AM (reads: 1947, responses: 0)

"Extended sets"/XSP look to me like string-indexed sets/functions. Not exactly what I would call an innovation.

It seems to me that people are flinging the word "technology" around a bit too much these days.

Here, let me introduce you to my realtime 3D virtual presence augmentation technology: I call it "vector calculus".

Noel Welsh - Re: SchemeUnit and SchemeQL: Two Little Languages

9/9/2002; 5:41:12 AM (reads: 1928, responses: 0)

Read the 'Third Manifesto' paper over the weekend and it seems to me that D has much in common with a functional language with multiple dispatch. I wonder if Date is aware of research into functional languages? I think he'd like them.

I've also briefly looked over D4 as described in the "Start Here" manual. The documentation is brief but it looks like no higher-order functions, polymorphism etc. If D4 is currently translated to SQL I wonder if we could make SchemeQL a 'D' and do the same. This would break the 'use previous knowledge'/gentle slope goal but should give us a cleaner implementation.

Bryn Keller - Re: SchemeUnit and SchemeQL: Two Little Languages

9/9/2002; 10:01:58 AM (reads: 1922, responses: 0)

Consider:

To answer your questions fully, you really need to read "The Third Manifesto" (Date & Darwen), "Practical Issues in Database Management" (Pascal) and many others. The web site above (http://www.dbdebunk.com/), if you can take the time to read through, will help you see the issues, but the books explain what, why, ...

and

Perhaps because you never cared to browse their books, or understand the fundamentals. Particularly, Pascal has a very simple, logical proposal for representing hierarchical data in his latest book. Good investment.

and

Do your homework.

This sort of rhetoric bordering on rudeness is common on dbdebunk.com, and I think it's really hindered their message. Their answer to every question is "You don't know anything. Buy my book and come back when you know something".

I would guess that Lambda The Ultimate is perhaps the one place on the web most likely to be friendly to the idea of data management backed up by solid theory. If you can't convince people here, you won't convince them anywhere.

Telling anyone who disagrees with you that they need to do their homework convinces no one. Telling them that they need to buy & read multiple books before you'll discuss the issues with them solves nothing.

I don't mean this as a personal attack, I'm just concerned because I don't think dbdebunk.com's rhetoric is appropriate here. This is a community of people who *are* interested in theory, and *want* to know better ways of thinking about problems. Just tell them what you think, don't tell them to go buy a book and then come back when they've read it.

YMMV and all that, and I promise not to mention it again. Sorry if I've offended anyone.

Leandro - Re: SchemeUnit and SchemeQL: Two Little Languages

9/9/2002; 10:46:01 AM (reads: 1901, responses: 0)

> This sort of rhetoric bordering on rudeness is common on dbdebunk.com, and I think it's really hindered their message.

Agreed about DBDebunk not endearing itself to readers. But I think you are too ready to take offense, and have quoted me out of context. For example, your first quote was an answer to someone who is trying to figure out hierarchical data in relational databases, but didn't buy the books. Was I expected to transcribe the full chapter of a book? I can't do so. I can give a quick summary, like:

Nodes should be one relation, and links another.

Now I can even expand this into several dozen lines of text without finishing clarifications, explanations, expansions, examples etc. And this is all in the book. So why start, unless it really becomes a topic of its own?
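For readers who want to see the "nodes one relation, links another" summary in running form, here is a minimal sketch. It is not taken from Pascal's book; the table names are invented, and the traversal uses WITH RECURSIVE, which SQL:1999 standardised and which SQLite supports from version 3.8.3, through Python's sqlite3 module.

```python
import sqlite3


def make_tree_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Each child has exactly one parent, so child alone is the key.
        CREATE TABLE link (
            parent INTEGER NOT NULL REFERENCES node(id),
            child  INTEGER PRIMARY KEY REFERENCES node(id)
        );
        INSERT INTO node VALUES (1, 'root'), (2, 'a'), (3, 'b'), (4, 'c');
        INSERT INTO link VALUES (1, 2), (1, 3), (2, 4);
        """
    )
    return conn


def subtree(conn, root_id):
    """Return (name, depth) for root_id and every node reachable below it."""
    return conn.execute(
        """
        WITH RECURSIVE sub(id, depth) AS (
            SELECT ?, 0
            UNION ALL
            SELECT link.child, sub.depth + 1
            FROM link JOIN sub ON link.parent = sub.id
        )
        SELECT node.name, sub.depth
        FROM sub JOIN node ON node.id = sub.id
        ORDER BY sub.depth, node.name
        """,
        (root_id,),
    ).fetchall()
```

`subtree(conn, 1)` walks the whole tree and `subtree(conn, 2)` only the branch under node 2. Whether this counts as a pleasant way to traverse a hierarchy is exactly what the thread is arguing about.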

> Their answer to every question is "You don't know anything. Buy my book and come back when you know something".

Now I think you are exaggerating. How can one educate himself without buying books? And it's difficult to recommend others' books nowadays, since there are only two sane authors left (relationally speaking). OK, three, but two of them always write together.

And the cold fact is that people really know nothing, because everyone else is regurgitating SQL vendors' press releases and OO hype. So one has to shock people into educating themselves. Or quit work and spend all his waking time writing into all kinds of forums for no recompense -- because it is impossible now to find an editor or school that will publish something sound.

> Telling them that they need to buy & read multiple books before you'll discuss the issues with them solves nothing.

Nothing? Nothing if they refuse to buy the books. This would be indicting oneself by showing that US$50 is more important than education. For us to transcribe all the arguments and proofs from the books isn't practical at all, and would be a copyright violation.

One might be too poor to buy such books, but then there are alternatives, like pestering employer or library until they buy it for you. When I couldn't buy the books, I scoured the Net for everything I could, and as a result found the scarce material now at the relevant category at DMoz. Ignorance isn't really an option, and I've been giving according to my small competence. But people can't be expected to write for nothing.

> Sorry if I've offended anyone.

No offense taken, and none intended.

Alex Peake - Re: SchemeUnit and SchemeQL: Two Little Languages

9/9/2002; 10:52:20 AM (reads: 1887, responses: 1)

Bryn,

You are right in that I was too rude in my reply. I apologize to all.

OTOH, the topic is very large, and well covered in the books, so what to do?

Alex

Ehud Lamm - Re: SchemeUnit and SchemeQL: Two Little Languages

9/9/2002; 11:35:23 AM (reads: 1935, responses: 0)

Most LtU readers are well educated, and references are, of course, welcome.

Please also keep in mind that this site is dedicated to programming languages, and try to stay on topic.

Chris Rathman - Re: SchemeUnit and SchemeQL: Two Little Languages

9/9/2002; 3:00:44 PM (reads: 1877, responses: 0)

Snippets of quotes here and there - in no logical order.

This is a community of people who *are* interested in theory, and *want* to know better ways of thinking about problems.

I suppose I'm the odd man out, because I'm weak when it comes to theory (I've got a whole stack of books and articles that I've got in the queue to read - so I guess it won't hurt to pile on a couple more). I am open-minded, though, and quite willing to express the limitations of my knowledge - so I'd hope not to put too much of a crimp on the discussions.

Now I think you are exaggerating. How can one educate himself without buying books?

Each of us learns in our own separate manner. In my experience, the best way to learn things is to teach them to others. A teacher always learns far more than the student. Short of that method, I usually learn things by seeing them in action and then going back to understand the decisions that went into the programming language design at a later point in time (hopefully well before I've shot myself in the foot). This method may very well be crude and wrong-headed (and possibly despised by some), but I'd venture to say that it's a fairly prevalent manner for the great unwashed programmers of the world at large.

Nothing? Nothing if they refuse to buy the books. This would be indicting oneself by showing that US$50 is more important than education.

Such things sound good in a vacuum, but the problem is that they ignore the concept of Opportunity Costs. Would I be better served to spend the time or money on this particular set of books, or perhaps something like say EOPL (or even, gag, a book on SQL for Dummies)? Each time you spend time and money on something, you are, by definition not spending it on something else.

For us to transcribe all the arguments and proofs from the books isn't practical at all, and would be a copyright violation.

I can only figure out a couple of possibilities. First, would be that you think that passing knowledge via discussion groups is too constraining (with which I'd agree that it has severe limitations but also has its purposes). Second possibility would be that you will not enter into a discussion with a person that does not have a requisite threshold of knowledge on a particular subject that you are interested in (conserving your time and resources to maximize your utility).

For example, your first quote was an answer to someone who is trying to figure out hierarchical data in relational databases, but didn't buy the books. Was I expected to transcribe the full chapter of a book? I can't do so. I can give a quick summary, like: Nodes should be one relation, and links another.

I wasn't necessarily expecting an answer - just stating an opinion about the 3 things that I find most frustrating with SQL. In the case of Hierarchical and Temporal data, it's quite easy to model the entities and relationships required to store the information. However, SQL is not just a tool for putting data into the (pseudo) relational database - it's also a tool for extracting information. Once you model, in the manner you indicate, you're still faced with the problem of very awkward and clumsy SQL syntax to traverse temporal or hierarchical tables.

BTW, I did read part of one of Date's books a couple of years ago in which he discussed hierarchical databases. I came away with some good insights, as well as confirming some of my own thoughts and methods - but mostly I've forgotten the specifics of what he recommended, with some of it inseparably integrated into my thoughts (non-attributable at this juncture), whilst other parts of it are irretrievably lost or misunderstood with the haze of time.

For us to transcribe all the arguments and proofs from the books isn't practical at all, and would be a copyright violation.

As you indicate in your post, the amount of information on the web concerning programming languages for relational databases seems to be rather small in comparison to any other environment. Most of the programming languages these days are either (a) Open Source; or (b) Open Specification. For example, it's nice I can go buy the books on Python (which I have), but it's even nicer to know that I can get a wealth of information on the language from any number of sources and perspectives aimed at many different levels of aptitude - most of which are freely shared. I can go to any number of newsgroups or forums and ask the stupidest question about Python and usually have someone give a helpful answer.

Historically speaking, the biggest problem we've had with query languages is their proprietary nature. Every vendor seems to want to keep certain aspects of querying to themselves. SQL may be an open standard but I think it's worse than it should be simply because no one wants to give away their advantage. The vendors need a standard at some level, but they do not want something that takes away the reliance on their particular database. As long as SQL continues to hobble along, they are not too concerned.

Now, you tell me how Dataphor and these other companies fit into this eco-system? It would concern me that even asking a few simple questions would result in my getting hit to spend money (at the risk of being labelled intransigent or unknowledgeable) before I even have a clue as to what is being promoted. That said, I know it is y'all's intention to be helpful and, yes, spending $$$ on the books might be a good investment - It's just that the prevalence of snake-oil-salesmen in the database market tends to make one skeptical of claims.

Is the problem that you believe this unique key must be enforced with no exceptions allowed?
Exactly, for some reasons. One is simplicity. There is no need for duplicates in a normalised database, therefore they represent an unnecessary complication.

What I'm not understanding is why setting a unique primary key on the table does not solve the problem? Ok, so you are saying that one can cause confusion by not putting a unique constraint on the table - I'd say that's a table design problem. Just because SQL has the facility to be abused by allowing duplicate rows, doesn't mean that you have to use it. As for efficiency, if the table was defined with unique rows, then the vendors ought to be smart enuf to use that optimization on the tables when unique keys have been defined.

Perhaps the problem is not with the physical layer, but rather the logical layer. Although a table can be constrained with unique indexes, I know of no similar facility provided with the query views. Is that where the primary complaint lies?
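The distinction raised here, between keys on stored tables and duplicates in query results, can be seen in a few lines. This is a sketch using Python's sqlite3 module with an invented `emp` table; it shows that a primary key keeps the base table free of duplicate rows while a projection of it is still a bag.

```python
import sqlite3


def projection_demo():
    conn = sqlite3.connect(":memory:")
    # The primary key guarantees that no two stored rows are identical.
    conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?)", [(1, "sales"), (2, "sales"), (3, "ops")]
    )
    # Projecting away the key column reintroduces duplicates in the result.
    projected = conn.execute("SELECT dept FROM emp ORDER BY dept").fetchall()
    # Set semantics are available, but only when asked for explicitly.
    distinct = conn.execute(
        "SELECT DISTINCT dept FROM emp ORDER BY dept"
    ).fetchall()
    conn.close()
    return projected, distinct
```

The projected result lists "sales" twice even though the table itself is duplicate-free, so a key constraint on the table does not by itself make query results behave as sets.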

Anyone who has dealt with users creating or interpreting SQL queries and their results knows how much confusion arises out of duplicates.

It's good to know that one of the purposes of the language is to reduce confusion amongst the neophytes. As someone who is just above the ranks of rookie, but well below the level of expert, I don't find the argument convincing - but perhaps others do.

[Quote(Darwen): "...A database is a set of axioms. The response to a query is a theorem. The process of deriving the theorem from the axioms is a proof. The proof is made by manipulating symbols according to agreed mathematical rules. The proof [that is, the query result] is as sound and consistent as the rules are."]
Paraphrasing Pascal: "...a row in a table is a statement of fact, or in logic, propositions about entities of interest asserted to be true..."

Which would confirm the fact that a declarative programming language would fit in well with relational databases. SQL itself is a form of declarative programming, though with a somewhat flawed implementation. Once you acknowledge the fact that (a) SQL is flawed; and (b) a declarative programming language should be used, you are still faced with the problem of actually implementing the programming language. As with any programming language, there are tradeoffs in speed, flexibility, consistency, and ease of use for the targeted domain and audience.
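A rough sketch of that reading, in Python rather than in any real D: each tuple asserts a fact, a relation is a set of such tuples, and a query is a declarative expression that derives new facts from them. The predicates and the sample data here are invented purely for illustration.

```python
# Each tuple asserts: "supplier <name> is located in <city>".
located_in = {
    ("Smith", "London"),
    ("Jones", "Paris"),
    ("Blake", "Paris"),
    ("Clark", "Paris"),
}

# Each tuple asserts: "supplier <name> supplies part <part>".
supplies = {
    ("Smith", "P1"),
    ("Jones", "P1"),
    ("Blake", "P2"),
    ("Clark", "P1"),
}

# Derived facts: "part <part> is supplied from city <city>".
# The set comprehension is a join on supplier name followed by a projection.
# Jones and Clark both yield ("P1", "Paris"); because the result is a set,
# that fact is stated once, not twice.
supplied_from = {
    (part, city)
    for (name_a, city) in located_in
    for (name_b, part) in supplies
    if name_a == name_b
}

print(sorted(supplied_from))
```

The query says what is wanted, not how to loop over storage, and the result is again a set of facts that could feed another query.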

I suppose it's theoretically possible to design a perfect programming language (ask any Scheme aficionado :-) but still there's quite a distance between this and the guidelines used in design and the actual implementation. A language that is solidly grounded in theory will usually avoid the glaringly obvious pitfalls of language design. But that doesn't mean that a perfect language will be an obvious fallout - the theory is mostly necessary but not sufficient.

Anyhow, I'd prefer to see some language examples that give some of the isomorphisms between D (or any True RDBMS) and SQL. I have a limited understanding of SQL, warts and all, so that's as good a starting point as any (at least for myself). Instead of harping on the shortcomings of SQL, it'd be nice just to see how some standard SQL constructs are mapped into a new declarative language. My original question, the only one that has much meaning to me, is whether there are some language examples on the web for the D language?

And yes, these questions are here because I'm cheap and lazy. Too cheap to buy the books, and too lazy to download the software - at least at this point in time. I suppose I would justify myself with the adage that the secret of a good programmer is to always pick the optimum level of laziness. So I'd probably understand if y'all want to match my admitted laziness with your own optimum level. :-)

Leandro - Re: SchemeUnit and SchemeQL: Two Little Languages

9/14/2002; 11:12:28 AM (reads: 1841, responses: 0)

> Read the 'Third Manifesto' paper over the weekend and it seems to me that D is much in common with a functional language with multiple dispatch. I wonder if Date is aware of research into functional languages?

Yes, he is. I just guess he is too busy to pay proper attention to them, with temporal applications in D, writing books and so on. But he does mention in one of the several versions of The Third Manifesto -- a paper, two editions -- that it could have interesting renditions in functional languages, or something of the like.

> If D4 is currently translated to SQL I wonder if we could do make SchemeQL a 'D' and do the same. This would break the 'use previous knowledge'/gentle slope goal but should give us a cleaner implementation.

Precisely, that was my original point. BTW, see Hugh Darwen's new site, http://www.thethirdmanifesto.com/. Not much there, but there is another D language defined by a student, sample chapters and other interesting stuff.

> Each of us learns in our own seperate manner.

Yes, I know that, my wife, my parents and my in-laws are teachers. But unfortunately, there is no way to get proper education without some labour. In computing, specifically in algorithms and databases, there is no way to really understand what is going on without some theoretical basis. More specifically, there is not yet a DBMS where one can see a good example of a valid D. Alphora Dataphor D4 is a valid D, but by virtue of being .Net it is still MS-W32 only, and not yet a full DBMS. And even if one does get a real RDBMS to play with, without understanding the relational model, more specifically the underlying principles of predicate logic, set theory, and normalisation, the chances are that one will produce a very bad DB, and then compensate by coding around its original badness. That is what we see everywhere SQL is used.

To sum it up, there is still no substitute for real education, and there will never be.

> you think that passing knowledge via discussion groups is too constraining

Most obviously. That is what I said, not otherwise.

> just stating an opinion about the 3 things that I find most frustrating with SQL

But then you are missing the point that SQL is not relational, but imposes several unreasonable arbitrary restrictions because it violates so many fundamentals of the model. This is a point that Date, Darwen, Pascal and Codd never ceased to make -- to be precise, Codd did cease, but that is because he has been severely ill for several years now.

> Now, you tell me how Dataphor and these other companies fit into this eco-system? It would concern me that even asking a few simple questions would result in my getting hit to spend money (at the risk of being labelled intransigent or unknowledgeable) before I even have a clue as to what is being promoted.

I am trying to get Nathan Alan, the creator of Dataphor and D4, to see the light of free software. Meanwhile, one can get a fully functional eval copy by requesting it at http://www.alphora.com/tiern.asp?ID=EVAL. Documentation is not yet complete, but I think that interoperable implementations would already be possible -- in fact, there are already three different Ds, namely Tutorial D, Dataphor D4, and D^d, and they should be interoperable too.

Also, even in the absence of implementations, it is my personal belief grown from hard-won experience that one should not be allowed to model data or write data manipulation statements and programs before understanding the relational model, because this will surely make for far more complicated, unreliable systems.

> What I'm not understanding is why setting a unique primary key on the table does not solve the problem?

Because the language itself has already become more complicated than necessary, the optimizer has become less capable and more complicated, and users are left unenlightened. Again, one must do his own homework, even on the Web, say at http://dbdebunk.com./ (there is too much material on this for me to reproduce here). Suffice it to say that the system has to deal with bags instead of sets.
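To make the bags-versus-sets point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (employee, id, city) are made up for illustration. The only point is that a primary key on the base table does not stop a query over that table from returning duplicate rows.

```python
import sqlite3

# In-memory database with a hypothetical table that has a primary key,
# so the base table itself can never contain duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, city TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO employee (id, city) VALUES (?, ?)",
    [(1, "Leeds"), (2, "Leeds"), (3, "York")],
)

# Projection onto a non-key column: SQL returns one output row per input
# row, so the result is a bag with 'Leeds' appearing twice.
bag = conn.execute("SELECT city FROM employee ORDER BY city").fetchall()

# Set semantics have to be requested explicitly with DISTINCT.
as_set = conn.execute("SELECT DISTINCT city FROM employee ORDER BY city").fetchall()

print(bag)     # [('Leeds',), ('Leeds',), ('York',)]
print(as_set)  # [('Leeds',), ('York',)]
conn.close()
```

So the key constraint protects the stored table, but every derived result still has to be treated as a possible bag unless the query says otherwise.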

> Perhaps the problem is not with the physical layer, but rather the logical layer. Although a table can be constrained with unique indexes, I know of no similar facility provided with the query views.

You touch another sore point. By supporting duplicates, and thus bags instead of sets, SQL cannot have updateable views, because they are not as predictable as derived relations would be. Again, I must refer you to the books.

> It's good to know that one of the purposes of the language is to reduce confusion amongst the neophytes.

Not only neophytes, unfortunately. Even experienced programmers and DBAs frequently fail to ever grasp the basic model of relational databases. And even people like myself stumble on all the unnecessary complexity of SQL and bags. Probably Intel would sell fewer, cheaper CPUs, and Micron less memory, if we had relational systems, because we would have far more capable, simpler, faster applications and databases.

> I'd prefer to see some language examples that give some of the isomorphisms between D (or any True RDBMS) and SQL.

Now this is your homework.