Visual Basic and LINQ

Over the last couple of months, both my existence and my judgments have been questioned several times on my favorite programming languages watering hole :-)

In the meantime, I was busily working with the SQL, XML, C# and Visual Basic teams on language integrated query, or, as it is now called, project LINQ. In particular, since early this year I have been collaborating with Amanda Silver, Paul Vick, Rob Copeland, and Alan Griver on what has become my programming language of choice: Visual Basic.

If you look closely at the new features introduced to C# and Visual Basic in the context of LINQ, you will recognize many familiar concepts that are regularly discussed on LtU, ranging from monads, meta-programming, lambda expressions, and XML programming to the relationship between static and dynamic typing.

The LINQ project consists of a base pattern of query operators (compare to the monad primitives) such as Select (map), SelectMany (concatMap), Where (filter), OrderBy (sort), and GroupBy (groupBy), on top of which Visual Basic and C# define query comprehensions (compare to monad comprehensions) that facilitate querying objects, relational data, and XML. The C# syntax for query comprehensions is similar to FLWOR expressions, while the Visual Basic syntax stays close to SQL, including aggregation.
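To make the correspondence in the parentheses concrete, here is a rough sketch using Ruby's built-in collection methods, which line up closely with the Haskell names. The Customer struct and the sample data are invented for illustration; nothing here is LINQ's own API.

```ruby
# Hypothetical sample data, invented for this sketch.
Customer = Struct.new(:name, :city, :orders)

customers = [
  Customer.new("Around the Horn", "London", [3]),
  Customer.new("Alfreds", "Berlin", [12, 7]),
  Customer.new("Bs Beverages", "London", [5, 9])
]

names      = customers.map { |c| c.name }                # Select     ~ map
all_orders = customers.flat_map { |c| c.orders }         # SelectMany ~ concatMap
londoners  = customers.select { |c| c.city == "London" } # Where      ~ filter
by_name    = customers.sort_by { |c| c.name }            # OrderBy    ~ sort
by_city    = customers.group_by { |c| c.city }           # GroupBy    ~ groupBy
```

Each call takes a block, just as each LINQ operator takes a lambda, so the operators compose by chaining in the same way.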

In addition to the language extensions and base operators, LINQ provides two supplementary domain-specific APIs, namely DLinq (compare to HaskellDB) for SQL relational data access, and XLinq (compare to HaXml) for XML hierarchical data access. Besides query comprehensions, Visual Basic provides deep XML integration with XML literals and XML late binding on top of XLinq (compare to Haskell Server Pages, XMλ, Comega).

Both Visual Basic and C# have added several additional language extensions in support of LINQ, including local type inference (the types of local variables are inferred from their initializers), lambda expressions (with type inference), local functions, anonymous types, object initializers, extension methods (static methods that can be called using instance method syntax), and meta-programming via expression trees (compare to type-based quote and quasi-quote).

Visual Basic adds some further enhancements to leverage the fact that it allows static typing where possible and dynamic typing where necessary, in the form of relaxed delegates, improved nullable support, dynamic identifiers (makes writing meta-circular interpreters a breeze) and, last but not least, dynamic interfaces, or, as I like to refer to them, strong duck typing (compare to simplified qualified types/type classes).

LINQ general website: http://msdn.microsoft.com/netframework/future/linq/
VB9 specific website: http://msdn.microsoft.com/vbasic/future

More

From Paul Vick.

Well, There Goes The Neighborhood...

What's the point?

Is this an attempt to recreate relational databases (badly) using objects, or to create an object-oriented database API for relational databases or what?

Does LINQ work with non-Microsoft vendors' databases? Does it support C++? LINQ looks like more "embrace, extend and extinguish": an attempt to tie Microsoft toolkit users to Microsoft's SQL Server.

Surely Microsoft must understand that such moves make it easier, not harder, for users to migrate away from Microsoft's toolset?

And could you please be more precise when you use the term "Visual Basic"? Are you talking about the old Visual Basic or about the .NET version of Visual Basic (I don't know what it is called now; I gave up two years ago)? They were different languages even prior to LINQ.

If I were you I would quietly cash in my stock options as soon as possible.

looks interesting to me

I honestly don't see how this is a bad thing for MSFT stock options...but I suppose that's for another forum.

As mentioned in the original write-up, this looks like list/monad comprehensions of many functional languages (I guess I'll continue my tradition of mentioning the paper "Comprehending Queries" in every post I make :) ).

Also, when Anders showed 'anonymous types' where attributes from one type can be projected to another type...on the fly (without first defining a class)...it looked exactly like he was using tuples (records actually) of existing functional languages. From what I understand, tuples and records are the basic components on which other features such as linked lists and classes are built...C# 3.0 will introduce them as a very high level component...interesting...makes you wonder how beneficial it would be just to allow raw tuples/records in the language.

I haven't actually written a program in FP yet. It seems by the time I do some reading on it, C# (and maybe Java?) will have become 'functional' themselves :).

Another thing I have noticed is that much of the functionality added to Java and C# recently has been of a syntactic nature (syntactic 'features' that are compiled away to some more fundamental operations). I wonder how fast these mainstream languages would evolve if end users could extend the syntax themselves (similar to LISP macros).

That's a lot to comprehend.

That's a lot to comprehend. 8^)

These changes look interesting. I suppose they will take some time to realize the full impact. I hope the tutorials are coming.

-Patrick

RE: There Goes the Neighborhood...

Just wanted to respond to Xeo's questions above. I'm not on the Linq team, but I do know a bit about it. First of all, it's neither an API nor an in-memory database system. It's an attempt to merge the data models of current relational and semi-structured data with that of object-oriented languages, so that interfacing between them is less painful. There are many excellent papers discussing the need for this (for example, here's one of Erik's).

Secondly, one of the biggest differences that surprised me about the design of Linq compared to Microsoft research efforts such as C-omega is that the connection to the database is not built directly into the compiler. Instead, it's built as a library using the general meta-programming framework in the language ("expression trees"). DLinq is the library that provides the interface to SQL Server, but there is nothing special about it, and it appears that much effort has been taken to ensure other database vendors can also offer integration just as easily.

And yes, the name "Visual Basic" did not change when the language was overhauled for .NET. VB 6.0 was the last interpreted version, and VB 7.0 and beyond all compile to MSIL.

Metaprogramming?

Can you point to any more documentation for the "general meta-programming framework in the language", please? Does this imply that any end-user could write a DLinq-equivalent DSL, or does it require privileged access to the compiler?

Metaprogramming

The idea is pretty cool I think :-)

There is an object model for fully resolved expressions and delegates. C# then adds a conversion from lambda expressions to these expression trees.

For example, the following declaration will create an expression tree that represents the function (x) => x > 2 and assign it to Foo:

  Expression<Func<int, bool>> Foo = (x) => x > 2;

We leverage the fact that lambda expressions can be converted to both delegates and expression trees all over the LINQ framework. For example, we have two extension methods for Where, one that takes a delegate (on the list monad) and one that takes an expression tree (on the query monad):

  public static IEnumerable<T> Where<T>([this] IEnumerable<T> source, Func<T, bool> predicate);
  public static Query<T> Where<T>([this] Query<T> seq, Expression<Func<T, bool>> pred);

The cool thing is that in both cases you write

  src.Where((x) => x > 2)

and depending on whether src has type IEnumerable<T> or Query<T>, you will get one or the other overload.

Of course, you can build expression trees yourself by directly calling the underlying System.Expression API.
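For readers who want to play with the "same lambda, two meanings" idea outside of C#, here is a loose analogy in Ruby. Ruby has no expression trees and no compiler-chosen overloads, so this sketch fakes the capture step by running the block once against a symbolic placeholder. The names Var, BinOp, ListSource and QuerySource are all invented for the example and do not correspond to anything in the LINQ preview.

```ruby
# Symbolic nodes standing in for an expression tree (hypothetical names).
BinOp = Struct.new(:op, :left, :right)

Var = Struct.new(:name) do
  # Comparing a Var computes nothing; it records the comparison as data.
  def >(other)
    BinOp.new(:>, self, other)
  end
end

# In-memory source: the block is simply executed (the "delegate" case).
class ListSource
  def initialize(items)
    @items = items
  end

  def where(&pred)
    @items.select(&pred)
  end
end

# Query source: the block is run once against a placeholder so that its
# structure is captured as a tree (the "expression tree" case).
class QuerySource
  def where
    yield Var.new(:x)
  end
end

in_memory = ListSource.new([1, 2, 3, 4]).where { |x| x > 2 }
tree      = QuerySource.new.where { |x| x > 2 }
```

The call sites are textually identical, but in_memory holds the filtered elements while tree holds a BinOp describing "x > 2". Unlike the C# conversion, this tracing trick only sees operators that the placeholder happens to define.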

Hope this helps,
Erik

Metaprogramming - DSL using Linq

Also, if you download the LINQ preview, it comes with a fun sample showing how to implement a DSL with C# 3.0 (in this case, a logic programming language).

LINKS

It might be interesting to compare LINQ with LINKS.

The overall design is, of course different, but there are quite a few shared goals.

Languages and Databases

As LtU readers know, it has been my belief for a long time now that we should strive for better integration between databases (in the general sense of the term) and programming languages. We discussed several projects trying to do this (e.g., SchemeQL), and I think LINQ is an interesting experiment, of huge scale.

Languages and databases are two of the fundamental abstraction tools programming technology has come up with, and real (semantics-preserving) integration between them isn't easy. It can create problematic coupling in systems, result in lock-in etc. But this is an important frontier.

How well LINQ solves the problem, I am still not sure.

It'd be great if people who have a bit of free time would download, experiment, and report back.

You don't need all that to make a DB application.

I think that all the things you mention, although it's nice that they are achievable, they are not really needed. I have successfully programmed many DB applications using nothing more than raw ODBC and JDBC using plain recordsets. Even Hibernate seems an unnecessary luxury to me. I never understood why people need these obscure object models, where plain recordsets are more than enough.

I've got to hand it to you though: I've never seen so many buzzwords in one post.

By the way, the more a language is extended with features, the more wrong it gets. In programming languages, less is more...unfortunately, people don't get it, and we end up with big ugly frameworks that everybody thinks they are necessary but nobody knows how to use properly, besides the fact that they add layers upon layers of useless code...

This sounds so familiar

"Higher-order functions? Though it's nice that they are achievable, they are not really needed. I've successfully programmed many applications using nothing more than GOTO. I never understood..."
OK, you can finish the rest in your mind.

Hm.

"Goto" plus creative uses of the stack achieves much the same effect, so the statement is valid. (In a sense.)

You only need higher-order functions if you care about ensuring correctness.

A good PL will implement those as libraries.

The strength of a programming language lies in the fact that all these features can be expressed in terms of the core constructs, and not by extending the compiler. If a compiler needs to be extended to support 'map' (for example), then the language clearly had a problem in the first place.

Where we are

Reading this may help clarify some of the problems current development practices often suffer from. In a nutshell: You either have close coupling between database organization and code (which makes changes in either harder and more costly to do), or you have to give up static checking, the help of development tools etc.

These problems can be overcome, of course. But better languages and tools will help.

RE: You don't need all that to make a DB application.

> I think that all the things you mention, although it's nice that
> they are achievable, they are not really needed. I have successfully
> programmed many DB applications using nothing more than raw ODBC

Who hasn't? But when it comes to really huge applications (in a software-engineering sense, not in a semi-pro sense) you're going to want at least an O/RM, unless your customers pay by the hour, have unlimited cash, and don't ask questions.

> and JDBC using plain recordsets. Even Hibernate seems an unnecessary
> luxury to me. I never understood why people need these obscure
> object models, where plain recordsets are more than enough.

I have to tell you that you have a giant lack of understanding, which may have led to your reserved attitude. You *can* do everything with recordsets; basically you can do everything with a machine that is Turing-complete, but that's not the point. LINQ is quite sophisticated (albeit not new, as its roots lie in the ML type system) and saves a lot of time. Using LINQ, I am able to finish an application before you have even finished the prototype.

> I've got to hand it to you though: I've never seen so many buzzwords
> in one post.

"Buzzword" means there's no real value behind it. But that's not true here. C# 3.0 can infer types from expressions, which is really new for C-/Java-like languages and is a huge advantage compared to untyped scripting languages.

> By the way, the more a language is extended with features, the more
> wrong it gets. In programming languages, less is
> more...

You're wrong. C# 3.0 is going in the right direction. The more abstractly I am able to express certain problems, the easier it is to understand the resulting software architecture. You wouldn't write a DB application in x86 assembler, would you? For every subroutine you would have to think about things like register allocation or instruction scheduling. The compiler does that for you. Without O/RM or LINQ you first create a DB schema, then write code to transfer relational data into objects, resolve references and things like that. That's a mechanical task; why would you want to do it yourself if a tool (e.g. Hibernate) can handle that?

> unfortunately, people don't get it, and we end up with big ugly
> frameworks that everybody thinks they are necessary but nobody knows
> how to use properly, besides the fact that they add layers upon
> layers of useless code...

If it is useless, drop it. But unless you can give evidence that you can handle large software projects without even self-written frameworks (which ultimately results in a copy-and-paste architecture, as framework means abstraction per se...), please don't put such nonsense statements on this site. There are a lot of people out there earning money programming DB applications, but that doesn't mean they're doing a state-of-the-art job.

Let's try to be concrete

How about making the discussion more concrete by talking about the specific features added, or the example on the LINQ website?

Comparison

I would suggest splitting up the discussion in two directions:

1- Visual Basic extensions;
2- LINQ functionality.

Since this is LtU, I would favor putting more emphasis on the first issue. Having said that, we have seen Visual Basic move significantly over the last couple of years, from an interpreted imperative language to a compiled OO language, and now to a compiled mixed-paradigm language with type inference. Would it be fair to compare VB9 to, for example, Scala? Both are mixed-paradigm languages targeting JIT-compiled runtime systems.

LINQ functionality

LINQ is a language design approach, so I am not sure I understand why you find that to be less on topic for LtU.

As far as I can tell LINQ is

As far as I can tell, LINQ is by itself an effort to integrate query facilities into the .NET framework, especially with respect to XML and RDBMS. In order to make effective use of these query facilities, VB9 and C#3 have been extended with features like higher-order functions, local type inference etc. These extra features will probably have a larger scope than LINQ, hence the separation I made into language extensions and LINQ. LtU being devoted to programming languages, and given the fact that the programming language extensions have a larger scope, I would opt for looking at the language extensions primarily. But of course that is only a suggestion.

When you put it like that...

...then I agree, of course.

I thought you meant something else.

Related work

The DLinq approach is, at first glance, reminiscent of this work (sorry, can't locate a copy anywhere else). The authors use an object model generated from a relational schema to construct correct query strings, though they don't seem to deal with the results of the query.

On an unrelated note, it will be interesting to see what kind of performance penalty one will have to pay when using DLinq over, say, ADO.NET.

An interview with Bill Gates from PDC 2005

With Jon Udell, LINQ and the reasoning behind it are mentioned.

dynamic identifiers

dynamic identifiers (makes writing meta-circular interpreters a breeze)

A link to an example would be nice...

Dynamic Identifiers


Eval[[o.m(...,a,...)]] = Eval[[o]].(m.Name)(...,Eval[[a]],...)
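Read as an interpreter rule, the equation says: to evaluate a method call, evaluate the receiver and the arguments, then invoke the member whose name is only known as data at run time. A minimal sketch of that rule in Ruby, where late-bound invocation by name is available as public_send, might look like this. The Lit and Call node types are hypothetical and exist only for this example; this is not VB9 syntax.

```ruby
# Hypothetical AST nodes for a tiny expression language.
Lit  = Struct.new(:value)
Call = Struct.new(:receiver, :method_name, :args)

def evaluate(node)
  case node
  when Lit
    node.value
  when Call
    receiver = evaluate(node.receiver)
    args     = node.args.map { |a| evaluate(a) }
    # The member name is a run-time value: Eval[[o]].(m.Name)(...)
    receiver.public_send(node.method_name, *args)
  else
    raise ArgumentError, "unknown node: #{node.inspect}"
  end
end

# Represents "hello".upcase + "!"
expr   = Call.new(Call.new(Lit.new("hello"), :upcase, []), :+, [Lit.new("!")])
result = evaluate(expr)
```

The whole Call case is one line of dispatch, which is presumably what "a breeze" refers to: the interpreter does not need a table mapping names to members.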

Dynamic Interfaces

We can have those in C#, too...
public class DynamoTests
{
	public void TestCreateWrapper()
	{
		TestClass testObject = new TestClass();
		ITestInterface testWrapper = (ITestInterface) DynamicInterface.CreateWrapper(
		typeof(ITestInterface), testObject);
		Console.WriteLine(testWrapper.Foo("Hello", 1));
		Console.WriteLine(testWrapper.Bar(6, 9).ToString());
	}
}

public interface ITestInterface
{
	string Foo(string bar, int baz);
	int Bar(int a, int b);
}

public class TestClass
{
	public string Foo(string bar, int baz)
	{
		return bar + baz.ToString();
	}

	public int Bar(int a, int b)
	{
		return a * b;
	}
}

DynamicInterface.CreateWrapper(type, object) takes the supplied interface and uses Reflection.Emit to spit out a dynamically generated class that implements it, converting every method call to a late-bound call on an inner object. An instance of that class is created, initialized with the target object, and returned.
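The same wrapping idea can be sketched in a few lines of Ruby, which may help show what the generated class does. Ruby has no interface declarations, so in this sketch the "interface" is just a list of method names, which is an assumption made for the example. The wrapper is built with define_singleton_method rather than anything like Reflection.Emit.

```ruby
class DynamicInterface
  # interface: array of method names the wrapper should expose
  # (a hypothetical stand-in for a .NET interface type).
  def self.create_wrapper(interface, target)
    missing = interface.reject { |m| target.respond_to?(m) }
    unless missing.empty?
      raise ArgumentError, "target lacks: #{missing.join(', ')}"
    end

    wrapper = Object.new
    interface.each do |m|
      # Each interface method becomes a late-bound call on the inner object.
      wrapper.define_singleton_method(m) { |*args| target.public_send(m, *args) }
    end
    wrapper
  end
end

class TestClass
  def foo(bar, baz)
    bar + baz.to_s
  end

  def bar(a, b)
    a * b
  end
end

TEST_INTERFACE = [:foo, :bar]

test_object = TestClass.new
wrapper     = DynamicInterface.create_wrapper(TEST_INTERFACE, test_object)
foo_result  = wrapper.foo("Hello", 1)
bar_result  = wrapper.bar(6, 9)
```

The conformance check happens when the wrapper is created, not when the program is compiled, and wrapper and test_object remain two distinct objects.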

Presumably dynamic interfaces in VB9 are generated at compile time?

Mixed identity

The usual complaint about proxying/wrapping is having two different object identities - testObject and testWrapper. Depending on the application this may or may not be a problem.

CLR changes

I didn't look at anything about LINQ in detail, but I wonder: are there any changes to the CLR involved? I remember that the runtime was changed to support parametric polymorphism (generics in C#), so if it's once again updated to offer better support for lambda expressions and the new language features, maybe it'll be a better environment for functional languages. That would be interesting.

No CLR changes for Linq

No, there are no CLR changes involved in Linq (the preview works on the v2.0 beta2 CLR). It seems likely that we may ship some of the core Linq interfaces with the Orcas (post v2.0) framework, but it's highly unlikely there will be any CLR changes for Linq in the Orcas release. Concepts like "nullable" were added in the v2.0 CLR and C# compiler partly in preparation for Linq.

Database perspective

Michael Rys shares his perspective on LINQ.

Dynamic Interfaces

How close does this bring us to achieving this goal:

Suppose... a set of collection interfaces: Enumerable - supporting iteration; Finite - supporting a size operation and Searchable - supporting a contains operation. It is possible to define the contains operation for any finite enumerable collection of elements supporting equality checking... Instead of relying on the class hierarchy to reflect common functionality, which might require restructuring the system when new interfaces are needed... it is possible to declare (via some form of logic programming) that any abstraction that supports the Finite and Enumerable interfaces also supports the Searchable interface. This connection can be done using the type system...

(from here)
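As a point of comparison, the quoted rule is easy to state dynamically, though without the static guarantee the quote asks for. In this Ruby sketch, "Finite" is taken to mean "responds to size" and "Enumerable" to mean "responds to each"; both readings, and the names Searchable, as_searchable and Countdown, are assumptions made for the example.

```ruby
# contains? is derived purely from iteration plus equality checking.
module Searchable
  def contains?(item)
    each { |x| return true if x == item }
    false
  end
end

# Hypothetical rule: anything that can iterate (each) and report a size
# counts as Finite and Enumerable, and therefore gains Searchable.
def as_searchable(collection)
  unless collection.respond_to?(:each) && collection.respond_to?(:size)
    raise TypeError, "collection needs both each and size"
  end
  collection.extend(Searchable)
end

# A collection class written with no knowledge of Searchable.
class Countdown
  def initialize(from)
    @from = from
  end

  def size
    @from
  end

  def each
    @from.downto(1) { |i| yield i }
  end
end

countdown = as_searchable(Countdown.new(5))
found     = countdown.contains?(3)
not_found = countdown.contains?(9)
```

Countdown's class hierarchy is untouched, which is the point of the quote. The check is made at run time on one object, however, so this is duck typing rather than the type-system connection the passage describes.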

Have you looked at "views"?

This is one of my long-time wishes for OO programming... Have you ever looked at Scala's views? They're obviously trying to reach this goal as well, and it's a pretty cool technique, although it strikes me as too much of a bolt-on hack to be a really elegant answer.

But there are a lot of questions I have about these kinds of features. How, for example, should these interface extensions be scoped? Statically? Dynamically? Obviously some kind of dynamic scoping is needed, because once an interface is extended, it would need to be available with the new interface when passed to functions and so on. But what should the dynamic scope of the extension look like? The whole program? What if different parts of the program want to add the same extension with different implementations? That doesn't make sense for your example, but it certainly does make sense in other cases...

Incidentally, this is something I've thought about a lot, and it's what I had in mind when I was reading the ContextL paper as well...

Not really

I am aware of them, but haven't studied them yet. You raise very good questions. My hunch is that a solution will be influenced quite a bit by type inference considerations.

More

Reusable Queries

Something I am unsure about wrt types and objects in VB and/or Linq... how "reusable" is a query? Say I define an instance of a class whose method performs a query, then can I apply that method on that single instance to any collection of objects, any XML document, and any database?

Or do I have to define that method using generics and create different concrete classes based on that generic definition, one per kind of collection, document, or database I wish to query?

-Patrick

Simple example

Code: here

webservices

So what effect does Linq have on the 'hide XML from the poor programmer' paradigm of web services programming? I ask because I recently had to write some example code that worked with a SOAP-based web service in the approved Microsoft manner, and it was the most irritating, pointless thing I have had to do all year. It would have been greatly improved if, instead of attempting to cast what was returned to an array, I could have just taken the XML and gotten at what I needed, but that, in the name of simplicity, was not allowed.

LtU Journalism

Today I went to a presentation by Anders Hejlsberg on LINQ. It was well received by the audience from what I can tell. Hejlsberg described himself as a strong believer in the usefulness of the concepts of the functional programming world. He stressed the importance of side-effect free code for concurrency, etc. He is also very pro static checking.

However, he regards having to adopt a whole new way of thinking as a strong negative, and expects the concepts to creep slowly into the existing OO-imperative languages.

Thinking of Haskell's semi-closed file handles, (perhaps wrongly) I asked whether or not it was possible to annotate the where clauses strict. But, he didn't seem to understand my question. I hope I haven't formulated it wrong.

dynamic interfaces

I see dynamic interfaces as a way to type-check duck typing at compile time.

Doing interfaces like this will not allow compile-time checking:

ITestInterface testWrapper = (ITestInterface) DynamicInterface.CreateWrapper(
typeof(ITestInterface), testObject);

But if it's integrated into the language, as in VB9, it will.
So with type inference and dynamic interfaces you can do the same things as with dynamic typing, but the code can be statically compiled,
as long as you don't use reflection, of course.

Am I wrong?

I'm not sure about dynamic

I'm not sure about dynamic interfaces, but the typed equivalent of duck typing is commonly called structural subtyping. OCaml has it: A class is a subtype of another if it supports the operations of the parent, rather than if it is explicitly declared as so.

dynamic interfaces better than structural subtyping

See Andreas Rossberg's code.

Nonsensical language

I'd say the language of your choice rather has a severely broken object system (which seems to use a certain amount of nominal typing btw). Here is your example in OCaml:

# class inch d =
  object
    method to_cm = d*.2.54
    method to_inch = d
  end;;
class inch : float -> object method to_cm : float method to_inch : float end
# class cm d =
  object
    method to_inch = d*.0.3937
    method to_cm = d
  end;;
class cm : float -> object method to_cm : float method to_inch : float end
# let a = new inch 10.5;;
val a : inch = <obj>
# let b : cm = a;;
val b : cm = <obj>
# b#to_cm;;
- : float = 26.67

So what is the difference?

VB9 dynamic interfaces use System.Reflection, so they are late bound.
OCaml structural typing is early bound.

OCaml's static type system renders runtime type mismatches impossible, and thus obviates the need for the runtime type and safety checks that burden the performance of dynamically typed languages, while still guaranteeing runtime safety.
OCaml delivers at least 50% of the performance of a decent C compiler.

What about ruby

LINQ:
var q = c.customers
    .Where(cust => cust.city == "London")
    .Select(cust => cust.companyName);
Ruby:
q = c.customers
    .find_all { |cust| cust.city == "London" }
    .map { |cust| cust.companyName }

The difference

When c.customers is a database table, the LINQ version will perform the computation on the server side.
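To illustrate what "on the server side" buys you, here is a toy Ruby query builder in which find_all does not filter anything itself but turns the block into SQL text. Everything in it (Condition, Column, RowPlaceholder, Table, and the generated SQL shape) is hypothetical and far simpler than DLinq; it only shows that the predicate has to be available as data, not just as executable code, for this to work.

```ruby
# A recorded comparison that knows how to print itself as SQL.
Condition = Struct.new(:column, :op, :value) do
  def to_sql
    escaped = value.to_s.gsub("'", "''")
    "#{column} #{op} '#{escaped}'"
  end
end

class Column
  def initialize(name)
    @name = name
  end

  # Record the comparison instead of performing it.
  def ==(other)
    Condition.new(@name, "=", other)
  end
end

# Any attribute read on the placeholder row yields a Column of that name.
class RowPlaceholder
  def method_missing(name, *args)
    args.empty? ? Column.new(name) : super
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

class Table
  def initialize(name)
    @name = name
  end

  # Runs the block once against a placeholder row and returns SQL that a
  # database server could execute; no rows are loaded or filtered here.
  def find_all
    condition = yield RowPlaceholder.new
    "SELECT * FROM #{@name} WHERE #{condition.to_sql}"
  end
end

sql = Table.new("customers").find_all { |row| row.city == "London" }
```

With a plain Array, the same block would be called once per element after all rows had been fetched. Here it is called once in total, and the filtering is left to whatever executes the resulting SQL.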

(Other than that, LINQ is not interesting.)