Structural Typing in .NET through Type-Parameters

I have a question related to some functionality I want to implement in a compiler for .NET. I originally called this 'duck typing', but after reading up more on type systems, I see that's incorrect.

The basic idea follows what's called structural typing; the main aim is to implement it through type-parameter extensions. Naturally, since this isn't something implemented at the level of the Common Intermediate Language, I'll have to handle the functionality myself.

Essentially you would define the requirements of a given type-parameter on the constraints of that type.

The information would be encoded within a series of interfaces which represent the various member constraints. Call sites that utilize the functionality would emit a type for every generic closure (a specific set of types closed in a specific instance vs. the full set of possible types, is this term proper?) of the generic type they use. I could go into details of how this will be handled on a functional level, but that's for a different discussion.

The question I have is: is structural typing useful for .NET languages? The concept would be statically typed in that the compiler generates the corresponding types to handle call dispatch. Assemblies built using a constraint-ignorant compiler (like C#) would be processed at run-time when the first call is made. This isn't 'dynamic' typing, for the simple reason that no change to the run-time state of the assembly can affect the object-structure linking and subsequent call dispatch.

Insight welcome.

Generally, before implementing a language feature you want some use cases, or some examples. So I suggest you show several small examples of C# code, each one followed by hypothetical code that uses your idea, using some simple syntax.

By leonardo m at Sat, 2011-07-09 15:13

Re: Examples

A common example of why you'd ever want to use structural typing is when you're dealing with a large series of objects that have nothing but their structure in common.

A silly, but relevant, example is if you wanted to print the name of various series of objects:

public static void PrintNames<T>(IEnumerable<T> series)
    where T
    [ //'[' versus '{' for obvious ambiguity reasons.
        string Name { get; }
    ]
{
    bool first = true;
    foreach (var item in series)
    {
        if (first)
            first = false;
        else
            Console.Write(", ");
        Console.Write(item.Name);
    }
    Console.WriteLine();
}

Granted that's just pseudo code at this point. It demonstrates the idea of focusing on the structure, versus the concrete type.

This doesn't mean you could pass a series of objects (IEnumerable<object>) to the method and have it work. It's not dynamic in the sense of looking up the individual object types at run-time; the intent is for it to be statically typed (you'd pass multiple sets: IEnumerable<Control>, IEnumerable<Type>, et cetera). So long as the various types you send it conform to the requirement of having a readable string property called Name, it would work.

The functionality necessary to handle the call would likely be marshaled through a compiler-generated implementation of an interface that represents that type-parameter's member requirements. The implementation itself would simply redirect the call for that given type.
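To make the previous paragraph a little more concrete, here is a minimal sketch in plain C# of what such a lowering could look like. Everything in it is an assumption made for illustration: the names INameBridge<T>, ControlNameBridge, GadgetNameBridge and Lowered.JoinNames are hypothetical, Control and Gadget are stand-in classes rather than real framework types, and the method returns the joined line instead of writing it so the result is easy to check.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical compiler-generated interface: one member per structural
// requirement on T (here only 'string Name { get; }').
public interface INameBridge<T>
{
    string GetName(T target);
}

// Stand-ins for two unrelated types that both happen to expose Name.
public sealed class Control { public string Name { get; set; } = ""; }
public sealed class Gadget { public string Name { get; set; } = ""; }

// Hypothetical compiler-generated bridges: each one only redirects the call.
public sealed class ControlNameBridge : INameBridge<Control>
{
    public string GetName(Control target) { return target.Name; }
}

public sealed class GadgetNameBridge : INameBridge<Gadget>
{
    public string GetName(Gadget target) { return target.Name; }
}

public static class Lowered
{
    // One possible lowering of PrintNames<T>: 'item.Name' becomes
    // 'bridge.GetName(item)', so T itself needs no common base or interface.
    public static string JoinNames<T>(IEnumerable<T> series, INameBridge<T> bridge)
    {
        var line = new StringBuilder();
        bool first = true;
        foreach (var item in series)
        {
            if (first)
                first = false;
            else
                line.Append(", ");
            line.Append(bridge.GetName(item));
        }
        return line.ToString();
    }
}
```

A call site would pass the bridge for its closed type, for instance Lowered.JoinNames(controls, new ControlNameBridge()). The extra interface call per member access is the dispatch cost of this approach.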

By Alexander Morou at Sat, 2011-07-09 23:49

Since for this kind of construct you're going to have to extend C#'s grammar and parsing process anyway, I don't think there is much ambiguity to avoid, should you wish to stick to '{' and '}' instead of '[' and ']'. IMO, it might not be too difficult to achieve if you give an existing keyword, say 'this' (*), another context of interpretation. It would act as a lexical marker that synchronizes your lexer/parser (whatever their nature is) with the new non-terminals you'd inject into the grammar for this purpose (as an optional extension syntax appearing before the member's defining body, as part of the generic constraints).

E.g.:

public static void PrintNames<T>(IEnumerable<T> series)
    where T : this
    {
        string Name { get; }
    }
{
    bool first = true;
    foreach (var item in series)
    {
        if (first)
            first = false;
        else
            Console.Write(", ");
        Console.Write(item.Name);
    }
    Console.WriteLine();
}

My 2cts.

(*) or whichever else introductory token you see fit

By Cyril at Sun, 2011-07-10 06:12

I think so.

I personally think so, working on a structurally typed .NET language myself; though I'm curious about how your approach will work for significantly large programs.

Scala can target .NET, so it might be useful to look there for prior work.

By RM Schellhas at Sat, 2011-07-09 16:24

Re: Your language

I'm curious to see your approach for implementing this; we might have some overlap.

I'm planning on constructing a series of interfaces relative to the extended functionality. Rather than having the user implement them, the compiler will do so, and the call sites that utilize the extended functionality will notify the target generic type of the implementation (i.e. an instance of the class that accepts the specific type 'T' for a given type-parameter).

The static constructor of the types using the closures will register a callback on the target generics. When the extended functionality is needed the first time, it'll invoke the callback and instantiate the relevant implementations for that generic closure. I prefer the callback method because it ensures that if you're using a large series of such functionality, you're only instancing lightweight delegates, versus the series of instances needed to provide the functionality.
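As a rough illustration of the callback idea, here is a minimal C# sketch under stated assumptions. BridgeSlot<T>, INameBridge<T>, WidgetNameBridge and CallSite are hypothetical names invented for this sketch, and the Created counter exists only to show when the bridge is actually instantiated; the real generated code could be shaped quite differently.

```csharp
using System;

public interface INameBridge<T> { string GetName(T target); }

// Hypothetical per-closure slot on the "target generic" side. Statics in a
// generic class exist once per closed T, so each closure gets its own bridge.
public static class BridgeSlot<T>
{
    private static readonly object gate = new object();
    private static Func<INameBridge<T>> factory;
    private static INameBridge<T> instance;

    // Called from a call site's static constructor: stores only a delegate.
    public static void RegisterBridge(Func<INameBridge<T>> callback)
    {
        lock (gate) { if (factory == null) factory = callback; }
    }

    // The first use invokes the callback; later uses reuse the same instance.
    public static INameBridge<T> Bridge
    {
        get
        {
            lock (gate)
            {
                if (instance == null)
                {
                    if (factory == null)
                        throw new InvalidOperationException("No bridge registered for " + typeof(T));
                    instance = factory();
                }
                return instance;
            }
        }
    }
}

public sealed class Widget { public string Name { get; set; } = ""; }

public sealed class WidgetNameBridge : INameBridge<Widget>
{
    public static int Created; // counts instantiations, for demonstration only
    public WidgetNameBridge() { Created++; }
    public string GetName(Widget target) { return target.Name; }
}

// Stand-in for a type that uses the structurally constrained functionality.
public static class CallSite
{
    static CallSite()
    {
        BridgeSlot<Widget>.RegisterBridge(() => new WidgetNameBridge());
    }

    public static string NameOf(Widget widget)
    {
        return BridgeSlot<Widget>.Bridge.GetName(widget);
    }
}
```

With this shape, registering costs one delegate per closure, and no WidgetNameBridge exists until CallSite.NameOf is first called.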

Suggestions on how else this could be implemented are appreciated.

Edit: If appropriate, I could generate a small example of what the compiler would generate.

By Alexander Morou at Sat, 2011-07-09 23:36

Ill defined...

Alas I am just a hobbyist poking around in the dark.

Currently I have the structural type system sitting outside of the CLR's bounds. Actual storage is bundles of dynamics with method execution being done by a slim layer that is aware enough of the 'other' type system to do dispatch properly.

By RM Schellhas at Sun, 2011-07-10 02:51

I believe it would be very much so:

If appropriate, I could generate a small example of what the compiler would generate.

(that is, to be "appropriate")

Indeed, if you can enlighten the readers with a possible implementation you already have in mind for this specific example, and maybe a couple of others, it would at least exhibit for the discussion the semantics you're contemplating (or hopefully a good deal of them), beyond the syntactical aspects. It wouldn't need to be regarded as a "reference" implementation at this point.

I find the idea interesting, btw. Looking forward to reading more.

By Cyril at Sun, 2011-07-10 05:49

Example.

Here's an example of what the compiler might generate.

It compiles. Further, when a method or a type has multiple type-parameters in play, the RegisterBridge method would take a delegate for each type parameter in one call, versus a separate call for each and every one.

Naturally such a system would need to be adjusted in the case that two type-parameters have constraints that include each other:

public static void Test<T1, T2>(T1 test1, T2 test2)
    where T1
    [
        void Test3(T2 test2);
    ]
    where T2
    [
        void Test4(T1 test1);
    ]
{
    test1.Test3(test2);
    test2.Test4(test1);
}

Naturally, such an example is a bit silly, but use cases are difficult to come up with off the top of my head given it's not possible right now.
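One conceivable adjustment, sketched here in plain C# purely as an assumption, is to make each bridge interface generic over both type parameters, so that neither constraint has to be resolved before the other. IT1Bridge<T1, T2>, IT2Bridge<T1, T2>, PingPongBridge, LoweredTest, Ping and Pong are hypothetical names for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bridge interfaces: because the constraints on T1 and T2
// mention each other, each bridge carries both type parameters.
public interface IT1Bridge<T1, T2> { void Test3(T1 target, T2 test2); }
public interface IT2Bridge<T1, T2> { void Test4(T2 target, T1 test1); }

// Two unrelated types whose members reference each other.
public sealed class Ping
{
    public readonly List<string> Log = new List<string>();
    public void Test3(Pong other) { Log.Add("Ping.Test3"); }
}

public sealed class Pong
{
    public readonly List<string> Log = new List<string>();
    public void Test4(Ping other) { Log.Add("Pong.Test4"); }
}

// Hypothetical generated bridge for the closure (T1 = Ping, T2 = Pong).
public sealed class PingPongBridge : IT1Bridge<Ping, Pong>, IT2Bridge<Ping, Pong>
{
    public void Test3(Ping target, Pong test2) { target.Test3(test2); }
    public void Test4(Pong target, Ping test1) { target.Test4(test1); }
}

public static class LoweredTest
{
    // One possible lowering of Test<T1, T2>: each member call goes through
    // the bridge that matches the constrained type parameter.
    public static void Test<T1, T2>(
        T1 test1, T2 test2,
        IT1Bridge<T1, T2> bridge1, IT2Bridge<T1, T2> bridge2)
    {
        bridge1.Test3(test1, test2);
        bridge2.Test4(test2, test1);
    }
}
```

A single bridge object can serve both type parameters of one closure, which fits the idea of registering all delegates for a method in one call.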

By Alexander Morou at Sun, 2011-07-10 11:46

Interesting. Might be a nice addition to C#, bringing interesting new opportunities (in frameworks/libraries, etc.) IMHO.

[...]but use cases are difficult to come up with off the top of my head

That got me thinking a bit. I could think of this one, which isn't even very difficult to come up with, actually:

// Sample context: in some facade that exposes only proxy in/out object types and wants to hide
// the details of a merge semantics translation from the remote objects' side to that of the proxies.
// Thus, consumers/callers of the facade need not be aware of what makes the remote objects mergeable,
// nor what makes the proxies the local representatives of such remote objects
// (both points known to the facade implementation only)
public TMergedProxy MergeWith<TMergedProxy, TMergeableRemote>(TMergedProxy forMerge, TMergedProxy toMerge)
    where TMergedProxy
    [
        TMergeableRemote RemoteObject { get; }
    ]
    where TMergeableRemote
    [
        TMergedProxy CreateProxy();
        TMergeableRemote MergeWith(TMergeableRemote other);
    ]
{
    TMergeableRemote remoteForMerge = forMerge.RemoteObject;
    TMergeableRemote remoteToMerge = toMerge.RemoteObject;
    return remoteForMerge.MergeWith(remoteToMerge).CreateProxy();
}

(in some hypothetical layer of some hypothetical application, etc)

I'm quite confident there must be a number of others. (Just sharing)

By Cyril at Wed, 2011-07-13 00:28

Re: Closed Systems

Closed systems were the main focus of such functionality.

Let's say you're writing your own framework and you have an abstraction layer that operates on two variations of the same theme. In cases where the underlying functionality of the variations is equivalent (member-wise), but the actual type hierarchies diverge (i.e. their only common base is System.Object), you might typically end up writing two different methods that are identical, where the types are the only variable.

Structural typing would alleviate that by handling the call dispatch, in cases where you're more worried about code maintainability than speed (there's a marginal cost to be aware of: each call you dispatch is actually two).

One area where this might hit a snag is when you're exposing some object as an interface, and the underlying implementation may vary and could potentially utilize type-parameters specified by the interface (i.e. the underlying object is generic, with the type-parameters supplied by the generic interface). If you dispatch a call to the underlying object, and it calls for structural typing from one such type-parameter, then since the compiler can't realistically know every possible type that can implement that interface, one of two things would have to happen:

1. If the implementation of the interface uses type-parameters that mirror those of the interface, a secondary dispatch can be made, which will utilize the bridge from the interface to do the dispatch.
2. The fallback mechanism of automatically generating the dispatch implementation would be utilized.
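For the second option, a run-time fallback could plausibly be built with reflection. The sketch below is an assumption, not the actual mechanism: FallbackBridge<T> and Gadget are hypothetical names, only the 'string Name { get; }' requirement is handled, and T is restricted to reference types because the open-instance delegate used here does not bind to struct methods in this form.

```csharp
using System;
using System.Reflection;

// Hypothetical fallback: when no compiler-generated bridge was registered,
// bind the required member by reflection once per closed T and cache it.
public static class FallbackBridge<T> where T : class
{
    // Runs the first time FallbackBridge<T> is used for a given T.
    private static readonly Func<T, string> getName = Build();

    private static Func<T, string> Build()
    {
        var property = typeof(T).GetProperty("Name", BindingFlags.Public | BindingFlags.Instance);
        if (property == null || property.PropertyType != typeof(string))
            throw new NotSupportedException(typeof(T) + " has no public string property 'Name'.");

        var getter = property.GetGetMethod();
        if (getter == null)
            throw new NotSupportedException(typeof(T) + ".Name has no public getter.");

        // Open-instance delegate: the first argument is the target object,
        // so later calls avoid per-call reflection.
        return (Func<T, string>)Delegate.CreateDelegate(typeof(Func<T, string>), getter);
    }

    public static string GetName(T target)
    {
        return getName(target);
    }
}

// Stand-in type used only to exercise the sketch.
public sealed class Gadget { public string Name { get; set; } = ""; }
```

The trade-off is that a type lacking the member is only rejected on first use (surfacing as a TypeInitializationException here) rather than at compile time, which is why this would be a fallback and not the primary path.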

The fact that interfaces need to be able to specify type-parameter structural constraints is why the hypothetical '[' was used in favor of '{', because the body of the constraints was identical to the body of the interface. I'm not fond of the ': this' structure because ':' implies that it derives from some type or implements some interface.

The structural typing implementation behind interfaces would be a bit different. They would merely act as a relay point, so that types which need to utilize their bridges can do so.

By Alexander Morou at Wed, 2011-07-13 01:44

Lambda the Ultimate
