Languages and data conversions.

Hi,

This is my first post to these forums, so I'm not sure it's exactly the best place. If anyone can suggest a better place for the topic, I'd be very interested in hearing about it. Also, I'm not sure of the latest terminology, so I'll start by explaining in a bit more detail what I'm attempting to do. OK, that's the disclaimers out of the way.

I am currently designing a new programming language that is based on a non-text representation. That is, you will need a more complex editor than just a text editor to modify the code. The programmer will be editing the AST directly. The aim of this programming language is that everything will be stored and communicated in the same format. That includes the language, the byte code, the data, etc. The language will be object-oriented with a Lisp feel (closures, etc.).

I know the idea of developing a programming language that is non-text based is not original. Does anyone know of any university or other projects that have developed this type of system?

The real reason for this post is that I'm interested in how this type of programming will change the semantics and basic structure of the language. An important aspect of this language will be turning objects into data, making objects from data, transforming data, and executing data (i.e. transforming data into a series of instructions).

I'll show a few examples to explain what I'm talking about.

Objects to/from data
--------------------

I'll explain using a simple made up class.

class Foo {
int x;
String str;
Bar b;

int getX()
{
return x;
}
....
}

The Foo class will need to be transferred in communications, written to file, etc. However, a lot of the time what is transferred differs from the object's own data layout. Let's say I have two formats:

Data FooData: { u32["x"], u8ascii["str"], BarData["b"] };
Data BasicFooData:{ u32["x"], u8ascii["str"] };

The first format, FooData, transfers all three pieces of data; the second leaves out the Bar data.

One idea I've had is to create a first class function called a "cast" which allows the programmer to write a function to perform any conversions to other data structures.

Cast FooData(Foo f)
{
return (FooData f.x, f.str, f.b);
}

Cast BasicFooData(Foo f)
{
return (BasicFooData f.x, f.str);
}

And from data to Foo:

Cast Foo(FooData fd)
{
return new Foo(fd.x, fd.str, fd.b);
}

Obviously, I'm not too worried about syntax at the moment. I'm just looking for some good constructs to help the programmer out. From a Java point of view, these types of conversions are usually handled by the object or by static helper methods. The idea of a cast being a construct separate from the class is that it allows new data conversions to be added later.
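For comparison, here is a minimal C++ sketch of conversions that live outside the classes they convert. It is only an illustration: the `Cast` template, the `cast_to` helper, and the use of an `int` in place of `Bar` are assumptions made for the example, not part of the proposed language.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the Foo class and the two data formats above.
struct Foo {
    std::uint32_t x;
    std::string str;
    int b;  // stands in for the Bar member
};

struct FooData { std::uint32_t x; std::string str; int b; };
struct BasicFooData { std::uint32_t x; std::string str; };

// The primary template is only declared. A conversion exists for a pair of
// types only when someone supplies a specialization, and that specialization
// can be written in any file, not inside either class.
template <typename To, typename From>
struct Cast;

template <>
struct Cast<FooData, Foo> {
    static FooData apply(const Foo& f) { return FooData{f.x, f.str, f.b}; }
};

template <>
struct Cast<BasicFooData, Foo> {
    static BasicFooData apply(const Foo& f) { return BasicFooData{f.x, f.str}; }
};

template <>
struct Cast<Foo, FooData> {
    static Foo apply(const FooData& d) { return Foo{d.x, d.str, d.b}; }
};

// Explicit call syntax: cast_to<FooData>(foo). The source type is deduced.
template <typename To, typename From>
To cast_to(const From& from) {
    return Cast<To, From>::apply(from);
}
```

With these definitions, `cast_to<BasicFooData>(foo)` drops the Bar member and `cast_to<Foo>(cast_to<FooData>(foo))` round-trips it. Asking for a conversion nobody has defined fails at compile time, because the primary `Cast` template has no definition.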

So, what do people think of this as an addition to a language? Is this handled in other ways in other languages?

I've got some other ideas I'd like to discuss, but I'll see how this goes down first. :)

Thanks,
David.

non text based representation

Editing the AST directly, object-oriented, closures: Smalltalk does this.

Your casting technique looks very similar to C++. Is it based on it?

I haven't done C++ for a long time, so I did a few searches to check. I think this is very close; however, in C++ you specify the function by overloading a conversion operator for the target type, e.g.

class Cents  
{  
private:  
    int m_nCents;  
public:  
    ...
    operator int() { return m_nCents; }
    ...
};

The problem with this is that you need to define all the casts with the class. I'd like to move that definition outside the class but have it operate the same way. So if the above class was in a library and I created a new class, I could add the cast later, e.g.

class Dollar
{
  ...
};

cast Dollar(Cents c)
{
   return new Dollar(c.getCents()/100);
}

main()
{
   Cents c = new Cents( 2303 );
   Dollar d = (Dollar) c;
}

Is there another way in C++ to specify the same thing?

I'll also check out Smalltalk.

Thanks,
David.

[Meta]

Instead of signing with "Thanks, David", why not change your user name to "David Ryan"?

[Meta]

Can I change the display name? I had a look, but couldn't see a way. I filled in the personal information, but that says it's not shown publicly. I use the userid oobles because it's easy to remember on different sites. I'll leave off the "Thanks, David" in future.

When viewing your account you should be able to click on edit and change the user name.

You can also use constructors in C++

Is there another way in C++ to specify the same thing?

class Cents {
    int _i;
    public:
        Cents(int i):_i(i){}
        operator int() {
            return _i;
        }
};

class Dollar {
    int _i;
    public:
        Dollar(Cents c) {
          _i = c/100;
        }
        operator Cents() {
          return Cents(_i*100);
        }
};
Cents c(2303);
Dollar d = c;
Cents c1 = d;

So it gives you the freedom to define the Dollar/Cents conversion in either the Dollar class or the Cents class.

Related Stuff

If you haven't already seen these:

- A discussion on LtU about Subtext.

- Language Workbenches.

Also

You should at least be aware of Intentional Programming.

Subtext

Thanks for the link to Subtext. I think this has many parallels to my work. I'm not sure I like the user interface, but the general goal of a non-text-based language is the same. I'll read up on the OOPSLA papers and see what gems lie beneath the surface.

Katahdin

The language Katahdin (home page) may be relevant here.

I also recommend taking a look at OMeta which could be useful for implementing your language.

CSG

The Cornell Synthesizer Generator by Reps and Teitelbaum. Yes, that's quite a while ago. :) And no, I don't think that there's anything quite like it available today.

Languages and data conversions.

It looks as though I have some more reading to do, especially in the area of intentional programming. In the meantime I've continued to think about the problem I'm trying to solve. This might be hidden away in some intentional programming papers, so if you've seen anything like it, let me know.

What I'm looking for is the type of language constructs required when programming with data structures. When using a data structure to define a GUI or some other part of the system, I need a way of converting that to objects, other data or code.

So for instance you might have a data structure for an address. (This is very simplified for the example)

struct Address {
   struct Street {
       string location;
       string name;
       string type;
   }
   string suburb;
   string state;
}

At times you might like to treat this as an object. At other times you might want to convert it to be viewed as part of a page. So for a page you might do a cast like I described above.

cast ViewElement Address.streetView( Address address )
{
   return new Div{
      Text { style="bold", src=address.street.location + " " + 
                               address.street.name + " " +
                               address.street.type
      };
   };
}

I've added a "streetView" name so that you could have multiple ways of converting one data structure to another.

The ViewElement, which contains data structures that look like HTML, could then be converted to an object which could be displayed. So you might have something like:

cast VisualNode Text.display( Text t )
{
    return new VisualNode( t );  
   // Let's assume that the constructor sets up the text node
   // for display in the display GUI class hierarchy.
}

The idea is that when it comes to programming the system you can do:

main()
{
   // create an instance of an address.
   struct Address a = new Address{ {"1", "some", "st" }, "some", "where" };

   // create a page and display the address.
   struct Page p = new Page{ content=(streetView) a };
   
   // View the page on a display.
   Display d = (display) p;
}

So that covers the basic idea of the conversions. The other idea I think it needs is the concept of views. The idea of a view is that it allows a structure to look like a class. That is, a view is a class that has no data members and uses the data members of a structure or another class. So something like:

view AddressView on Address {

    string getStreet()
    {
       return street.location + " " + street.name + " " + street.type;
    }

};

This way you can make data structures behave like classes. The idea is that it's cheap for the VM to wrap the data structure with the methods.

This wrapping could also be applied to classes. This could be useful where a library implements classes that you don't want to extend or are final. The view could also implement interfaces for more flexibility. eg.

interface CommonAddressView
{
   string getStreet();
}

view AddressView implements CommonAddressView on Address
{
   string getStreet();
}
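As a rough point of comparison, the view idea can be approximated in C++ with a wrapper class that holds only a reference to the underlying structure. This is a sketch under assumptions: the `streetOf` helper and the choice of a const reference member are mine, and a real `view` construct in the proposed language could well be cheaper than a virtual call.

```cpp
#include <cassert>
#include <string>

// Plain data structures, mirroring the Address example above.
struct Street { std::string location, name, type; };
struct Address { Street street; std::string suburb, state; };

// The interface a view can choose to implement.
struct CommonAddressView {
    virtual ~CommonAddressView() = default;
    virtual std::string getStreet() const = 0;
};

// The view has no data of its own apart from a reference to the structure
// it wraps, so it adds behaviour without copying or modifying Address.
class AddressView : public CommonAddressView {
public:
    explicit AddressView(const Address& a) : address_(a) {}

    std::string getStreet() const override {
        return address_.street.location + " " + address_.street.name + " " +
               address_.street.type;
    }

private:
    const Address& address_;
};

// Hypothetical client code that only knows about the interface.
std::string streetOf(const CommonAddressView& view) { return view.getStreet(); }
```

Here `streetOf(AddressView(a))` works for any `Address a` without `Address` knowing about the interface. The wrapper must not outlive the structure it refers to, which is the price of keeping it data-free.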

So, any comments or criticisms for these ideas? Has it all been done before?

I'm slow

so I apologize if I'm not cluing in yet! But it feels to me like you can get this with a less different OO language; something not far from Java or C#. Like, would the implicit conversions of Scala be close enough? I don't quite grok what the casting approach does that is better.

Implicit conversions of Scala

Thanks for the link to Scala. I think what they're doing with implicit conversions is very close to what I want.

I think there are a couple of differences. I prefer to make the conversion explicit. I notice that a number of people have made that complaint about Scala's implicit conversions already.

The other difference is that I like the idea of naming the conversion. The advantage being that you can have multiple ways of transforming between two types.

I'm glad to see that the implicit conversions of Scala seem to be liked as it adds weight to what I want to do.

In the end, the language I create is likely to be close in behaviour to other OO languages. I just want to make sure there are better ways of dealing with data structures. Conversion functions are one good way. I think the other important element will be class wrappers around data structures. I have yet to see that concept featured in any language, although I'm sure it must have been done somewhere.

Re: scary implicit conversions

Ja, doesn't C++ have the ability to do snekrit conversions (well, you have to have written the appropriate constructor or operator I guess) which can kinda lead to surprise bugs? Making things explicit sounds like a good default.

Functions?

Pardon the stupid question, but if you want your conversions to be explicit and you want them to be named, isn't that just a function?

The other difference is that

Re: ``the other difference is that I like the idea of naming the conversion.''

That's not a difference. Scala's implicit conversions are named. And, yes, you can write them explicitly if you wish.

Oops.. my mistake.

I just found a nice blog showing me a better example.

http://www.extensium.com/blogs/?p=23

From the example it shows in the blog one thing isn't clear. Can the example in the blog be something like:

foo(Runnable run)

implicit def toRunnable(f:Unit): Runnable = new Runnable { def run() = { f }}

implicit def delayedRunnable(f:Unit): Runnable = new Runnable { def run() = {
f
Thread.sleep(1000)
}}

foo(delayedRunnable{Console.println("foo")})

I know the code isn't correct, but hopefully you get the idea. The concept of having two different implicit conversions, and being able to select the correct one.

But if you select an implicit conversion...

...it isn't implicit any more. :)

Of course, it can be implicit in the case where the conversion is unambiguous.

OTOH, in my experience, explicit conversions should be the default behavior, with implicit conversions only enabled when needed. I use C++ mainly at work, and I have learned to *always* preface single-argument constructors with the "explicit" keyword, unless I have a good reason not to--every once in a while I get bit by an implicit conversion that surprises me.
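A small sketch of the difference the "explicit" keyword makes. The class and function names (`LooseCents`, `total`, `totalStrict`) are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Single-argument constructor without "explicit": it doubles as an
// implicit conversion from int.
class LooseCents {
public:
    LooseCents(int value) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

// Same class, but the constructor is explicit: an int must be wrapped by hand.
class Cents {
public:
    explicit Cents(int value) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

int total(LooseCents a, LooseCents b) { return a.value() + b.value(); }
int totalStrict(Cents a, Cents b) { return a.value() + b.value(); }

// total(5, 7) compiles: both ints silently become LooseCents.
// totalStrict(5, 7) does not compile; totalStrict(Cents(5), Cents(7)) does.
static_assert(std::is_convertible<int, LooseCents>::value,
              "int converts implicitly to LooseCents");
static_assert(!std::is_convertible<int, Cents>::value,
              "int does not convert implicitly to Cents");
static_assert(std::is_constructible<Cents, int>::value,
              "Cents can still be built from an int explicitly");
```

The `static_assert` lines let the compiler confirm the claim: the explicit constructor removes the silent conversion but keeps direct construction available.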

Often times, when you *think* you want an implicit conversion--what you really want is a polymorphic function (which can then perform explicit conversion). This latter approach gives you much more control of when conversions are performed. No more trying to figure out if "5" + 23 is "523" or 28... :)

On further reflection...

in C++, the expression "5" + 23 is neither the string "523" nor the number 28.

Instead, it's a wild pointer. :)

And no type conversions are necessary to bring about that bit o' ugliness, either.

A large steaming pile of ugliness

And no type conversions are necessary to bring about that bit o' ugliness, either.

In C++, according to the draft from 1996, a string literal has a type "array of N const char" (2.13.4.1) and, in your example, is implicitly converted to the type "pointer to const char" (4.2.1).
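A short C++ sketch of what that conversion means in practice. The helper names `skip` and `concatenated` are invented for the example, and it deliberately uses an in-bounds offset, since "5" + 23 itself points far past the end of a two-character array and has undefined behaviour.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// The literal "5" is an array of two const chars: '5' and the terminating '\0'.
static_assert(std::is_same<decltype("5"), const char (&)[2]>::value,
              "a string literal is an lvalue of array type");

// In an expression such as "5" + 23 the array decays to a pointer first.
static_assert(std::is_same<std::decay<decltype("5")>::type, const char*>::value,
              "the array decays to const char*");

// Pointer arithmetic on the decayed pointer: this skips characters,
// it does not append anything.
const char* skip(const char* s, int n) { return s + n; }

// What the author of "5" + 23 probably wanted, with the conversion spelled out.
std::string concatenated() { return std::string("5") + std::to_string(23); }
```

So `skip("hello", 2)` yields a pointer to "llo", and getting "523" requires an explicit `std::to_string`.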

The first thing we'll do..

is kill all the language lawyers. :P

Good catch.

I've always thought that "Wild Pointers" (or alternatively, Wyld Pointers) would be a fine name for a rock band.

nice one

I am now of the opinion that implicit conversions should be avoided at all costs. I like OCaml's approach: everything is explicit, and even the arithmetic operators perform no conversions. This prevents programmer laziness.

What about parameters?

Implicit conversions make me really, really nervous. But implicit parameters are the bee's knees. I wonder if you dislike both, or just conversions?

By implicit parameters, I assume you mean default arguments to functions? I haven't programmed with those, so I'm not sure how flexible they are. Is there anything they provide that can't be provided by "narrower" function definitions? Given how light function defs are in functional languages, they aren't much of a burden.

Not exactly

I'm sorry, for some reason your last post led me to think you were more familiar with Scala's implicit conversions. Scala's implicit parameters do not have default values, but rather, they are parameters whose values can be inferred from among a particular set of names in the calling scope. For example:

implicit val a: Int = ...
implicit val b: Double = ...
val c: Int = ...

def f(implicit x: Int) = ...

println(f) // "a" is selected as the only correct argument to f

This shows the idea: possible choices for the implicit parameter must be declared "implicit" by the programmer, the choice of parameter is driven by the static required parameter type (using the normal method overload rules), and in case of ambiguity, the compiler issues an error and requires you to state your intent.

This can be tricky, but I find it to be an enormously powerful tool, and in general, I find it much less error-prone and confusing than implicit conversions.

Incidentally, this is very closely related to Haskell's type classes. The set of visible "implicit" names corresponds to the set of instances, while implicit parameters correspond to a type class context on the function. Haskell's type classes amount to implicit type-directed dictionary passing, so in some sense, implicit parameters can be seen as a bit more general. The downside is that you have to write more boilerplate, but there are several upsides: (1) it's more general, (2) you have fine-grained control over the scoping of instances (unlike Haskell, where there is a global set of instances), and (3) you can always override the compiler or resolve ambiguities by hand (i.e., in my example above I can always write "f(a)" or "f(c)" when I need to). (Finally, Haskell also has an extension called "implicit parameters", which I believe are related, but based on a somewhat different idea. In particular, I believe Haskell's implicit parameters are selected by name, rather than inferred by type.)
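To make "dictionary passing" concrete, here is the explicit, hand-written version in C++. This is only a sketch of the idea, not how Scala or any Haskell compiler implements it; `Show`, `showAll`, `decimalShow` and `hashShow` are names invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// A "dictionary": the operations a Show-like class would provide for T.
template <typename T>
struct Show {
    std::function<std::string(const T&)> show;
};

// A function with a "Show T" context takes the dictionary as an ordinary
// argument. Implicit parameters or type classes would fill it in for you.
template <typename T>
std::string showAll(const Show<T>& dict, const std::vector<T>& xs) {
    std::string out;
    for (const T& x : xs) {
        if (!out.empty()) out += ", ";
        out += dict.show(x);
    }
    return out;
}

// Two instances for the same type. With explicit passing the caller picks one,
// which is the "resolve ambiguities by hand" case.
const Show<int> decimalShow{[](const int& n) { return std::to_string(n); }};
const Show<int> hashShow{[](const int& n) { return "#" + std::to_string(n); }};
```

Calling `showAll(decimalShow, xs)` and `showAll(hashShow, xs)` on the same vector gives two different renderings. An implicit mechanism removes the first argument when exactly one candidate is in scope.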

Sorry if this is all known to you...

No, it's all new for me. My distaste for implicit conversions comes from trying to reason about my own code in C# where I used such conversions. Implicit parameters definitely sound interesting, but anything that removes explicitness harms auditability (by which I mean understandability of the code for any purpose). If I understand your presentation, these instances can be scoped, so perhaps it's a little safer. I'd have to use them for a while, I suppose.

Explicit is better than implicit?

...anything that removes explicitness harms auditability (by which I mean understandability of the code for any purpose).

I think this is stated too generally to be true. How do you feel about implicit typing (i.e. type inference)? Sometimes making something implicit can increase "auditability" by lowering noise and making the code clearer (particularly in the presence of tool support to e.g. show the inferred type).

Actually, now that you mention it

I think I would like a language which interacts with and requires the IDEs to insert the inferred types into the code as comments, so things are auto-documented as you go along.

I think this is stated too generally to be true.

You may be right, but I'll stick to it until I hear a compelling counter-example. :-)

How do you feel about implicit typing (i.e. type inference)? Sometimes making something implicit can increase "auditability" by lowering noise and making the code clearer (particularly in the presence of tool support to e.g. show the inferred type).

I think type inference is a wash. You can find many cases where it harms auditability, and many cases where it clarifies and generalizes code (user over-constrains a definition for instance). I can't think of any cases where implicit conversions actually clarify meaning in this way, as they carry a heavy mental burden. I can think of any number of ways in which importing a new namespace can completely change the meaning of code by importing new implicit conversions.
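A C++ illustration of that last point, using overload resolution plus the built-in int-to-double conversion in place of a user-defined implicit conversion. The namespace `extra` and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// The only describe() visible by default. Calling it with an int
// silently converts the argument to double.
std::string describe(double) { return "double"; }

namespace extra {
// A better match for int arguments, hidden away in another namespace.
std::string describe(int) { return "int"; }
}

// Without the import, describe(1) converts 1 to double.
std::string withoutImport() { return describe(1); }

// Same call text, but the using-directive makes extra::describe visible,
// and overload resolution now prefers the exact match.
std::string withImport() {
    using namespace extra;
    return describe(1);
}
```

The call `describe(1)` is character-for-character identical in both functions, yet it resolves to different code depending on what has been imported.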

I remember reading the Ada rationale where they tried as hard as possible to make the meaning of code immune to small changes, such as importing a new package. They even had some sort of open challenge, and after something like 5 years, someone finally provided an example where altering a use statement altered the meaning of a program. I think the sort of clarity that provides the developer is a laudable goal.

Ah, the good old Beaujolais Effect...

Excellent! Thanks for the ref. I'm actually seriously considering using GNAT Ada in lieu of C for the VM I'm building. Anyone with any experience doing that? All I should need is a small library around GNU Lightning. How is GNAT support for Windows?

Don't know about a binding

I used GNAT on windows for years (as did my students) with no issues.

I can't think of any cases where implicit conversions actually clarify meaning in this way

Implicit conversion between fixnums and bignums, perhaps?

Hmm, that's a good one, but it changes the big-O profile of your code behind your back, so I'm still not sure that's a benefit in the end.

What about overloading / typeclasses?

What about overloading / typeclasses? BigNum / Int conversion is just one example of many. Should you explicitly call intToString to print an integer?

I agree that importing a new namespace should never change the meaning of code, but that's not a reason to outlaw all conversions - just ambiguous ones. Note that a similar issue exists with shadowing bindings, which most languages get wrong.

Also, with tool support (perhaps the commenting mechanism raould suggested), I think some type inference is an unqualified win.

What about overloading / typeclasses?

Not a fan of their global nature. But I come from a capability security background, so I'm immediately suspicious of anything global.

[Edit: of course, my comment has to be taken in the context of the language's common idioms. Typeclasses are very common in Haskell, and so when perusing any Haskell code, a developer will already be looking for typeclasses. They're so fundamental as to be unavoidable. I would still question whether they're a good idea though.]

BigNum / Int conversion is just one example of many. Should you explicitly call intToString to print an integer?

Yes, I think so, or use some convenient short hand for pretty printing.

Also, with tool support (perhaps the commenting mechanism raould suggested), I think some type inference is an unqualified win.

I made a similar argument for supporting custom operators. I agree tools make many things easier, I'm just wary of relying on them too much.

What about typeclasses?

What about overloading / typeclasses?

Not a fan of their global nature.

I actually agree with you on this, but I see it as a separable issue.

I guess I don't have any strong counterexamples to your position. Generally my view is that making everything explicit adds too much overhead to abstraction, when abstraction should be encouraged. Also I don't think there's a high mental cost if it's done correctly.

Generally my view is that making everything explicit adds too much overhead to abstraction, when abstraction should be encouraged.

I think I agree with that only when the idiom is a pervasive idiom of the language, like typeclasses in Haskell. Even still, my opinion is that explicitness helps newbies, and given custom operators the syntactic overhead of an abstraction can be minimized. ML seems pretty close to what I envision in this regard.

[Edit: also, easy partial application mitigates the syntactic overheads of explicitness somewhat. ML also supports infix operators, which is why I believe it approaches my ideal in this regard.]

I tend to find typeclasses much safer than, say, mutable state when it comes to things happening implicitly. Having access to an instance is no more dangerous than having access to the individual functions in practice, and if you want to narrow down the scope for overloading you can make the relevant type more specific.

That said, I can still construct classes that do things I'll look at and go "yuck", just as I could do similar things with templates in C++. IDEs with pervasive type feedback at least make it easy to get your head around what's going on if you're unsure, which ought to be enough for audit purposes - unfortunately we don't really have that.

The problem there isn't present with single parameter typeclasses - rather, it's in using features like functional dependencies to perform type-level computations which further instance dispatch can depend on.