Lambda the Ultimate

Udell on licensing and lock-in
started 1/23/2004; 2:46:20 PM - last post 1/24/2004; 7:52:00 AM
Ehud Lamm - Udell on licensing and lock-in
1/23/2004; 2:46:20 PM (reads: 10054, responses: 8)
Jon Udell has an interesting item on open-source lock-in, inspired partly by a change in the licensing terms of MySQL (LGPL to GPL for the API).

Two points may be of interest to the LtU community.

Jon quotes Kingsley Idehen saying "You should never find yourself locked into any database vendor, programming language vendor, operating system vendor, or business application vendor...." But programming languages shouldn't really be in this list. A programming language should have a complete and accurate public standard, eliminating the danger of vendor lock-in. This is important for many reasons (getting a language 'right' is awfully hard; a public standard goes a long way in ensuring that design errors are caught in time), among them the issues Jon discusses. Think about this a bit, and then turn your mind to VB if you will...

The second issue is more interesting technically. It is about writing database access libraries (APIs) using mainstream programming languages. This is an issue I have mentioned here quite a few times.

Simply hosting SQL (i.e., embedding it as text) inside programs is bad for a multitude of reasons (think compile-time checking), but it's the standard approach, and we have years of experience using it.

The crucial problem is that if you want to create a database access layer as part of the design of your project, you don't have many choices for the design of the interface between your access layer (which hides the actual SQL etc.) and the rest of your system. You can use explicit iterators or cursors, which hurt the elegance of the code; return full tables, which hurts performance (and, when optimized so as to produce small tables, hurts flexibility and modifiability); or implement a DSL-like protocol between the access layer and its clients, complicating the design and requiring dynamic SQL.

Some programming languages have facilities that can ease this sort of programming problem, which is quite common. I am thinking about things like laziness (and generators) and macros. Perhaps you have more suggestions.
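
To make the generator option concrete, here is a minimal sketch in Python using the standard sqlite3 module. The customers table, the Customer type, and the function names are assumptions made up for illustration. The access layer hides the SQL and hands back a lazy stream of typed rows, so the client neither manages a cursor nor receives a full table.

```python
import sqlite3
from typing import Iterator, List, NamedTuple


class Customer(NamedTuple):
    id: int
    name: str
    balance: float


def customers_with_balance_over(conn: sqlite3.Connection, minimum: float) -> Iterator[Customer]:
    """Access-layer function: the SQL stays here, the caller only sees Customer values.

    This is a generator, so the query runs and rows are converted only as the caller iterates.
    """
    cursor = conn.execute(
        "SELECT id, name, balance FROM customers WHERE balance > ? ORDER BY id",
        (minimum,),
    )
    for row in cursor:
        yield Customer(*row)


def demo() -> List[str]:
    # Hypothetical data set up in an in-memory database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, balance REAL)")
    conn.executemany(
        "INSERT INTO customers (id, name, balance) VALUES (?, ?, ?)",
        [(1, "Ada", 120.0), (2, "Bob", 15.0), (3, "Cy", 300.0)],
    )
    names = []
    for customer in customers_with_balance_over(conn, 100.0):
        names.append(customer.name)
        break  # the client stops early; the remaining rows are never turned into objects
    conn.close()
    return names
```

The client code reads like a loop over an ordinary collection, which is the elegance explicit cursors tend to cost, while still avoiding the up-front cost of returning a whole table.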

It would be interesting to hear Erik's thoughts on this matter given his unique perspective. Noel can also enlighten us, based on his experience with SchemeQL.


Posted to Software-Eng by Ehud Lamm on 1/23/04; 2:50:20 PM

Luke Gorrie - Re: Udell on licensing and lock-in
1/23/2004; 3:39:01 PM (reads: 503, responses: 1)
[This comment intentionally left blank.]

Luke Gorrie - Re: Udell on licensing and lock-in
1/23/2004; 3:43:33 PM (reads: 492, responses: 0)
Working with the Mnesia database in Erlang is a totally different approach from using an RDBMS.

The database is written in Erlang, runs in the same OS process as the application, and stores Erlang terms natively (each row is a tuple). The data model is just like a hashtable or a tree, and because there is no latency you can just write your queries as straight Erlang code and give Mnesia a closure to execute with transaction semantics.

I think that Mnesia is one of the best programs in the world.

We use Mnesia for the configuration database in our system, and put a few hooks to good use. For one thing, the database itself notifies the appropriate processes when their configuration is changed. For another, we use a hook right before the end of each transaction to call our consistency-checker. If it finds an inconsistency, the database transaction gets rolled back, so our "valid configuration" invariants are strongly enforced. (They can only be broken by redefining the invariants.)

Consistency rules are written either in straight Erlang or in a small DSL. The rules are always deterministic with respect to the configuration database, and we record precise dependency information for each of them to avoid rechecking properties that cannot have been broken.
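
The shape of such a pre-commit check with dependency tracking might look roughly like this in Python. It is a sketch under assumptions, not our actual checker: Rule, commit, the configuration keys, and both example rules are invented for illustration.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple

Config = Dict[str, object]


class Rule(NamedTuple):
    name: str
    depends_on: frozenset                 # configuration keys the rule reads
    check: Callable[[Config], bool]       # deterministic in the configuration


class InvalidConfig(Exception):
    """Raised when a rule fails; the old configuration stays in force."""


def commit(current: Config, changes: Config, rules: List[Rule]) -> Tuple[Config, List[str]]:
    """Apply changes only if every rule that could have been broken still holds.

    Returns the new configuration and the names of the rules that were rechecked.
    """
    candidate = {**current, **changes}
    changed = {k for k, v in changes.items() if k not in current or current[k] != v}
    checked = []
    for rule in rules:
        if rule.depends_on.isdisjoint(changed):
            continue  # none of its inputs changed, so it cannot have been broken
        checked.append(rule.name)
        if not rule.check(candidate):
            raise InvalidConfig(rule.name)  # roll back: the candidate is discarded
    return candidate, checked


# Hypothetical rules over hypothetical keys.
RULES = [
    Rule("port_in_range", frozenset({"port"}), lambda c: 0 < c["port"] < 65536),
    Rule(
        "cacerts_needed_when_verifying",
        frozenset({"ssl/verify", "ssl/cacerts"}),
        lambda c: c["ssl/verify"] != "require" or c["ssl/cacerts"] != [],
    ),
]

config = {"port": 443, "ssl/verify": "none", "ssl/cacerts": []}
config, checked = commit(config, {"port": 8443}, RULES)
```

Changing only the port rechecks only the port rule. A later commit(config, {"ssl/verify": "require"}, RULES) raises InvalidConfig and leaves config as it was.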

When a rule specified with the DSL is broken, an explanation of the error is automatically generated from the rule text. I enjoy some local infamy for how very badly I wrote the text generator :-). This week a colleague showed me this gem:

Invalid setting for /cfg/ssl/server 1/ssl/cacerts.
Invalid because neither its value ([]) is not [] nor Verify (require) is not
require

What can I say? It looked so easy in Steele's thesis. That particular feature has been deprecated. :-)

Ehud Lamm - Re: Udell on licensing and lock-in
1/23/2004; 3:49:44 PM (reads: 495, responses: 0)
Yes, closures and continuations can be useful. I didn't want to rub it in, so I didn't write about these in my original post...

About Mnesia: If I understand correctly, you connect to the DB using standard Erlang message passing, right?

Can the DB live on a different machine (distributed app)? Can Erlang closures go over the wire?

Luke Gorrie - Re: Udell on licensing and lock-in
1/23/2004; 4:08:49 PM (reads: 481, responses: 0)
The DB is just another collection of Erlang processes, so yes, you talk to it with message-passing. If you want to talk to a Mnesia instance in a different Unix process or on a different machine, you can do it transparently with Erlang's regular distribution method.

Our system is a line of clusterable internet appliances with a so-called "single system image". That is to say, each appliance runs Mnesia with full replication of the configuration of the whole cluster, and any configuration change is applied globally/atomically to all machines. If you have a lot of machines in the cluster, we nominate some as "masters" to participate in transactions, while the rest are "slaves" that just have the results of transactions pushed to them. If a slave needs to perform a transaction involving writes then it just sends the closure to its master via distribution.

So yes, closures go over the wire, but it requires you to have the same code on each machine (only a reference to the code is sent). This is the norm in Erlang clusters - they tend to be tightly coupled and have automated global/consistent software upgrades to keep them in sync.

If you really need to send a closure between two machines running different code, there is a fun trick. Probably they both have the same version of the Erlang interpreter (as used by the shell/REPL), so you can just pass a closure over its eval function that includes the syntax tree of the code-version-dependent expression you want to evaluate. (More practically, you can pull up a remote REPL via distribution, send the code you need over in a message, and patch it in.)
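
The trick of shipping a syntax tree instead of a code reference can be illustrated by analogy in Python. This is not Erlang's mechanism, only the same idea in miniature: the expression travels as plain data (JSON here), and the receiver wraps it in a closure over an evaluator that both sides are assumed to share. The tiny expression language and all names are hypothetical.

```python
import json
import operator
from typing import Callable, Dict, Union

# The evaluator both sides are assumed to have in common.
OPS = {"+": operator.add, "*": operator.mul, ">": operator.gt}

Tree = Union[int, float, str, list]


def evaluate(tree: Tree, env: Dict[str, float]):
    """Evaluate a tree made of numbers, variable names, and [op, left, right] nodes."""
    if isinstance(tree, (int, float)):
        return tree
    if isinstance(tree, str):
        return env[tree]          # a variable lookup
    op, left, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))


def send(tree: Tree) -> str:
    """Sender side: the code is serialized as data, not as a reference to a module."""
    return json.dumps(tree)


def receive(wire: str) -> Callable[[Dict[str, float]], object]:
    """Receiver side: rebuild the tree and close over it with the local evaluator."""
    tree = json.loads(wire)
    return lambda env: evaluate(tree, env)


# The expression x + 2 * y > 10, sent over the "wire" and run remotely.
wire = send([">", ["+", "x", ["*", 2, "y"]], 10])
remote_fn = receive(wire)
result = remote_fn({"x": 1, "y": 5})
```

Because only the tree is sent, the two sides need to agree on the evaluator and nothing else, which is the point of the trick.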

Avi Bryant - Re: Udell on licensing and lock-in
1/23/2004; 5:40:05 PM (reads: 461, responses: 0)
I've been playing with a database DSL of sorts in Smalltalk, which provides a lazy way of accessing relational databases by modelling the relational algebra as Smalltalk expressions and then generating corresponding SQL queries as needed. I wrote a little bit about this here: http://www.cincomsmalltalk.com/userblogs/avi/blogView?entry=3246121322

I haven't used it in anger myself, but I know some people that have and were quite happy with it.
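
The general idea of building up a relational expression and only generating SQL when it is needed can be sketched in Python as follows. This is not the Smalltalk library's design or API; the Relation class, its methods, and the employees table are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}


@dataclass(frozen=True)
class Relation:
    """An immutable description of a query; nothing touches a database until to_sql is used."""
    table: str
    columns: Tuple[str, ...] = ("*",)
    conditions: Tuple[str, ...] = ()
    params: Tuple[object, ...] = ()

    def where(self, column: str, op: str, value: object) -> "Relation":
        """Selection: returns a new, narrower relation."""
        if op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        return Relation(
            self.table,
            self.columns,
            self.conditions + (f"{column} {op} ?",),
            self.params + (value,),
        )

    def project(self, *columns: str) -> "Relation":
        """Projection: returns a new relation with only the named columns."""
        return Relation(self.table, columns, self.conditions, self.params)

    def to_sql(self) -> Tuple[str, Tuple[object, ...]]:
        """Generate SQL on demand. Table and column names are trusted input in this sketch."""
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql, self.params


employees = Relation("employees")
well_paid = employees.where("salary", ">", 50000)
query = well_paid.where("dept", "=", "R&D").project("name", "salary")
sql, params = query.to_sql()
```

Each step returns a new value, so intermediate relations such as well_paid can be passed around and refined by client code before any SQL exists.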

Patrick Logan - Re: Udell on licensing and lock-in
1/24/2004; 7:52:00 AM (reads: 356, responses: 0)
I vote for SchemeQL, ROE (Avi's), Mnesia, and Xen, all four. Right here you have probably the four most interesting database/language activities of recent years. (Any more? There's a Haskell system comparable to SchemeQL called...)

I'll always have a fondness for Gemstone/S, but GS/S never had a satisfactory solution to table integration. There are two ways I would have wanted to integrate ROE. One is that GS/S had its own special block syntax for structural access (PDF). This could have been ROE instead. The other is to have integrated ROE itself into GS/S for its access to SQL databases.