Let's make a programming language!

Since LtU members are so knowledgeable on programming languages, why don't we design the (ultimate) programming language? Let's all post our suggestions here to make a lean-and-mean programming language. It would be an experiment filled with fun. I can make the compiler, if a 'standard' comes out of this. I apologise if this has been proposed before (and obviously failed, since there is nothing to show for it).

My initial suggestions are two:

1) put as little as possible into the compiler. Make the language so versatile that all high-level concepts can be easily done with the core constructs.

2) the lowest level of the language should map directly to the hardware, i.e. it must be some kind of assembly, in order to allow for very low-level programming. But the language should be versatile enough to provide the means for building layer upon layer of abstraction, so one chooses the appropriate level of abstraction for each particular type of application.


As long as we're defining the perfect language...

How about a language to end the ongoing dynamic vs. static saga. Let's have a language with support for explicit types but also with special syntactic sugar for universal types. :-)

the language

I feel that, in order to fulfill this goal, such a language ought to be very close to Scheme. I'm just remembering how in SICP they exercise everything from declarative programming down to assembly, and all of it in Scheme.


Self-modifying code and macros are a must. While I'm not averse to parentheses, I feel that most people don't like them. Maybe some sort of graphical representation, generated along the lines of XML/XSLT (horrors!) but finally done right, could be fun.


And to finish, I'll provoke a bit: Text editors suck! (I became a fan of the formula editor in LyX some time ago! Using the shortcuts, I'm just a factor of 1.5 slower with many formulas than with handwriting.)

I'd agree, as long as it didn't use...

....S-Expressions. :-)

what's so bad about them?

is it the prefix notation? the parenthesized grouping?

Prefix notation is bad since it departs from conventional mathematics usage, but for crying out loud: this isn't mathematics, it's programming!

Function calls in mainstream programming languages always begin with the name; in other words, they use prefix notation. So, a Java guy would say taxDeductions( 2004 ) and a Lisp fellow, (tax-deductions 2004). What's the matter here? It also illustrates that the difference in parenthesis usage is the location of the verb: outside in most mainstream languages, inside in Lisp.

Prefix notation isn't used in OO conventions though, where you first spell out the subject and then the verb to declare something about it: person.father, while (father person) looks far more natural (to me, at least)...

Not to mention that it frees you from typing a lot of annoying separators (besides read-friendly whitespace), as in:

( OCaml )
[1; 2; 3; 4; 5; 6] vs '(1 2 3 4 5 6) or

( C )
printf( "heya! %s, %s, %s, good to see 'ya all!",
"john", "mary", "frank" ); vs
(printf "heya! ~A, ~A, ~A, good to see 'ya all!"
'john 'mary 'frank))

Is there something i'm missing?

Missing the punchline.

The comment was meant to be more of a ribbing. First, I know it always grates the Lispers when you complain about s-expressions. Second, we all know that any language will eventually be doomed to re-inventing Lisp. :-)

My opinion on s-expressions is that they are fine in the small but I do like clear demarcations of functions, modules, objects, etc... When I look at an s-expression program, I get bogged down in seeing "data". That can be a good thing since the distinction between data and program is arbitrary and minimizing that distinction can be used to good effect. But I'm still left with the problem of parsing it for visual clues.

But then, this is a long-running argument, which is what I was tersely alluding to in my feeble attempts at joking. IIRC, McCarthy experimented with M-expressions early on, but they did not lend themselves to the use of macros (so they were also dropped early on).

huh?

"But I'm still left with the problem of parsing it for visual clues."

how about editors with syntax-highlighting? ;)

I'm being entirely subjective....

...and not entirely reasonable along these lines. We'd get into an argument about what constitutes a readable syntax. Do we go with what makes the most sense from the standpoint of legacy (C, C++, Java, etc.)? Or do we go with a syntax that makes the most sense for expressiveness (Lisp, Smalltalk, etc.)? Or do we lay a new path that best represents the unique ideas that we are trying to convey with our new language?

Tools for manipulating programs are always interesting, and Lisp has always been at the forefront on this (as well as many other things). Note that the environment for Smalltalk is an important component of the programming experience, with its various ways to get a handle on the inspection and manipulation of things within the image.

In the end, you could probably convince me (as I'm easily assuaged). But then you still have to convince all the masses out there that there are benefits to learning a syntax that they are not used to (whatever language du jour they happen to be using). So even if we agree among ourselves what the syntax should be and how much power it gains us, there's a lot of evangelizing that would have to be done to gain acceptance.

All this kidding aside, I think designing a language is something that's tricky. In the end, if you write a language, the first rule of thumb is that you write it to scratch an itch (be it ease of expression, ease of maintenance, or pedagogy). Once written, that's when you worry about whether it's useful for anyone else. So before you even talk about the form and substance of a language, you have to decide what exactly the itch is that you are trying to scratch.

Since LtU represents a somewhat diverse community of users with the commonality being issues concerning programming languages, I'm not sure that we really share itches. IOW, why are we wanting to design a language? A kitchen-sink approach is extremely hard (say Ada or the current effort in Fortress).

indeed

i know i'm happy with Scheme, OCaml and Haskell. and i know Java, C++, Perl and other grunts are happy too... :)

maybe a new kitchen-sink language could go by the name Babel. ;)

What about just a very small change

Why not put the function outside the parentheses, and default to quote?

(1 2 3 4 5 6)
printf("heya! ~A, ~A, ~A, good to see 'ya all!"
'john 'mary 'frank)

I certainly find foo() a lot more readable than (foo).

no

"I certainly find foo() a lot more readable than (foo)"

Perhaps you're just used to it from Algol offspring?

It's a simple convention: the first element in an unquoted list is _always_ a procedure. Putting it outside the parentheses would break the all-important program-as-data concept because of a tiny nitpick which shouldn't exist if you're well aware of the convention...

It should also prove trickier for the Lisp interpreter to parse...
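
To make the convention concrete, here is a minimal sketch in standard Scheme (using the two-argument eval of R5RS):

(+ 1 2 3)     ; unquoted list: the first element, +, is applied => 6
'(+ 1 2 3)    ; the same list, quoted: plain data, a four-element list
(eval '(+ 1 2 3) (interaction-environment))  ; => 6, data treated as program again

Move the procedure outside the parentheses and the last line stops working so directly: code and data no longer share one shape.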

"Putting it outside the paren

"Putting it outside the parenthesis would break the all-important program-as-data concept"

Naw, it would just be less simple to parse.

not algol, common sense

"Perhaps you're just used to it from Algol offsprings?"

It's not Algol, but general math, where you can always put parentheses around a subexpression without changing the meaning.

Also, putting things between any kind of braces would suggest containment or protection, which is quite the opposite of execution.

not common sense...

it's just a notation.

Yes, putting things inside parentheses certainly does convey a containment feeling. And that's great for the idea that programs are data.

On the other hand, it's just a convention: Lispers know the first element in an unquoted list is ( should be ) a procedure, so there's no confusion.

There are languages which use postfix notation, and that's just another convention as well. Just like the math notation, which, BTW, has changed a lot through the centuries...

It's not common sense: it's made up conventions useful for their domain of application.

Even math has it wrong

Even the mathematicians got it wrong, and we live with it. Everyone is used to seeing the syntax, y = f(x), which would have made much more sense if it had been written y = x : f, or if you want x -( f )-> y. The reason is that composition reads most naturally from the left to the right, just as in reading.

Due to this, we see Haskell with its composition operator (.), with things like: result = (f . g . h) x, which really is h applied first, then g, then f. With the alternative notation it becomes obvious: x -(f g h)-> y means the result of piping x through f, then g, then h, putting the result in y.

F# and some OCaml people have started doing this by defining |> and >> as (left) application and composition. I think that's a very nice convention.
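
Both operators are tiny definitions in any functional language. A rough Scheme sketch, where pipe and compose> are made-up names standing in for |> and >>:

(define (pipe x f) (f x))                      ; left-to-right application, like |>
(define (compose> f g) (lambda (x) (g (f x)))) ; left-to-right composition, like >>

(pipe 5 (compose> (lambda (x) (* x 2))
                  (lambda (x) (+ x 1))))       ; => 11: double first, then add one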

Naturally?

composition reads most naturally from the left to the right, just as in reading

Left-to-right is just another arbitrary (western) convention. Other cultures read right-to-left, or even vertically.

Doesn't matter.

Sure, there are different reading orders, but current programming languages are pseudo-English, so to be coherent the notations used should be left-to-right.

That said, I don't find x |> f -> y easy to read, but that's probably just because I'm not used to this notation.

Math

current programming languages are pseudo-English

Might be, but he was complaining about math notation, for which this isn't the case.

Left to right is the norm in math

When did you ever see a (math) diagram where the arrows went left? I bet it's a small percentage. I agree that this is "western" thinking, but western thinking is dominant within science.

At least one algebra book used the notation 'x f' for "x through f", and it was for sure alien, but it does make sense.

PKE.

Of course, we do call 'em

"arabic" numerals for a reason...

Yes...

... because they're really Hindu numerals.

(Actually there is a reason: because the Arabs adopted them and we got them as hand-me-downs from the Arabs.)

tiny nit

Hindi (language) numerals not Hindu (religion) numerals

Even tinier nit

It is Hindu (civilization - Indus/Indian) numerals, not Hindi (which is a very young language, ~13th century A.D. at least) numerals

touche

touché

Reverse application and the lambda-calculus

Putting the argument before the function in applications in the lambda calculus makes beta redexes easier for humans to parse, since the part of the lambda abstraction that takes the argument, the lambda binder, is then next to the argument it takes.

This is the item notation used in Kamareddine & Nederpelt's 1995 Refining reduction in the lambda-calculus, which puts it to use to expose some redexes in terms lying just beneath the surface.

Similar to a CPS notation?

The item notation in this paper looks similar to the notation I use for CPS transformation. To simplify the CPS notation, the arrow is defined to be equivalent to:

z->x.y  ==  (\x k.k y) z

With this notation, the CPS transformation of ((\x.(\y.(\z.z d)) c) b) a simplifies to:

((\x kx.kx ((\y ky.ky (\z kz.kz (z d))) c)) b) a
(b->x.(\y ky.ky (\z kz.kz (z d))) c) a
b->x.c->y.(\z kz.kz (z d)) a
b->x.c->y.a->z.z d

Yes, quite similar

Have you written this up?

I did not write any paper

I did not write any theoretical paper on the subject ;)

In fact, that works exactly the same without CPS ((\x.y) z == z->x.y) and I cheated in my reduction by dropping the brackets at step 3. You would normally need a variable substitution before you can do that.

Self-modifying code and

Self-modifying code and macros are a must.

Why do you say that? I'm all for metaprogramming, but I don't see the need to resort to something as inconsistent and complicated as macros.

So you end up with users

So you end up with users calling eval on quoted code instead; what's the biggie? Any system that doesn't let you manipulate an AST for the code in question is going to cause major pain.

I know. What I'm saying is

I know. What I'm saying is that macros aren't necessary to allow for such things. Really, macros are just one big premature optimization and are inflexible to boot.

Pretty random, abstract

Pretty random, abstract claim.

Depends on what your definition of a macro is. What would you use instead? Have you ever programmed in Scheme or Lisp? How are they related to optimizations?

Any function that operates on and generates code is a macro. They're also useful for creating new syntax, DSLs, etc. Not particularly optimization tools though.

Depends on what your

Depends on what your definition of a macro is. What would you use instead? Have you ever programmed in Scheme or Lisp? How are they related to optimizations?

Lisp-like macros would be what I'm referring to, yes.

Any function that operates on and generates code is a macro.

Not quite. A macro is a function that operates on/generates code before normal code is executed. You've actually noticed something about macros that most people don't--the fact that macros and functions overlap completely in functionality. In essence, what I'm proposing is moving the most useful thing about macros--compile time execution--into the domain of functions, either automatically by the compiler or explicitly by the programmer. This has several advantages, including the fact that there is now a single interface to do a single thing and that compile time functions can be executed at compile time and run time without any change to said function.

You misunderstand Lisp

Lisp macros are executed at runtime! Presumably some get evaluated at compile time too as optimizations. Macros are not about compile-time evaluation at all.

You've actually noticed something about macros that most people don't--the fact that macros and functions overlap completely in functionality.

I think everyone understands that. A lisp macro is a function that operates on code.

No, I understand that

No, I understand that perfectly well. In fact, in the very beginning I acknowledge this fact:

A macro is a function that operates on/generates code before normal code is executed.

If I did make reference to macros being executed at compile time after that, it was simply for convenience, since I don't recall the exact name other than the somewhat verbose and not very descriptive 'pre-execution time' ;-)

Can you explain your

Can you explain your arguments then? You claimed macros are "just one big premature optimization and are inflexible to boot". I don't see where this is coming from. Then you posted about compile time execution, stating that you want the same interface for compile-time and run-time macros. But lisp already has this, and it isn't about compile time execution. So I thought you had a misunderstanding of Lisp macros, but apparently not.

So what are you arguing for? You usually have a good point, it just takes a while to beat it out of you ;)

optimization

The only optimization case that I can think of for macros is forcing inlining of code, in (-drumroll-) C (pre-C99).

Partial evaluation

In essence, what I'm proposing is moving the most useful thing about macros--compile time execution--into the domain of functions, either automatically by the compiler or explicitly by the programmer. This has several advantages, including the fact that there is now a single interface to do a single thing and that compile time functions can be executed at compile time and run time without any change to said function.

You're describing partial evaluation. See A Hacker's Introduction to Partial Evaluation, and the Online Bibliography of Partial Evaluation Research. It's often been discussed here on LtU. There are some other links from the wikipedia article.

A related subject is abstract interpretation, which can loosely be described as executing a program before all its input is available.

Partial evaluation can move computations to compile time, which overlaps some of what macros can do, but it by no means replaces macros. The unique uses for macros, particularly in Lisp and Scheme but also more generally, are to (1) create new binding constructs, (2) implement unusual evaluation orders, and (3) define minilanguages for data definitions and other kinds of DSL (summarizing from this post by Matthias Felleisen).

None of these things are achieved by partial evaluation alone, and for the most part, these features require a syntactic transformation in principle, so whatever you do to achieve these things is going to end up being macros, in some form.
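
To make category (1) concrete: a macro can introduce a binding at its use site, which no function can do, since a function's arguments are evaluated before it ever runs. A minimal Scheme sketch (if-let is a made-up name):

(define-syntax if-let
  (syntax-rules ()
    ((_ (name expr) then else)
     (let ((name expr))
       (if name then else)))))

(if-let (hit (assv 2 '((1 . a) (2 . b))))
        (cdr hit)       ; hit is visible here only because the macro binds it
        'not-found)     ; => b

No partial evaluator can supply this, because the transformation is syntactic: name is captured from the use site before evaluation ever happens.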

You're describing partial

You're describing partial evaluation. See A Hacker's Introduction to Partial Evaluation, and the Online Bibliography of Partial Evaluation Research. It's often been discussed here on LtU. There are some other links from the wikipedia article.

A related subject is abstract interpretation, which can loosely be described as executing a program before all its input is available.

I don't think that's really what I'm talking about. I'm talking about compile-time execution of functions, e.g.:

def pow(x, y):
    if y == 0: return 1
    if y == 1: return x
    return x*pow(x, y-1)

x = pow<compile-time>(2, 16)

Where pow(2, 16) is computed at compile time (or at run time if it's run in an interpreter).

edit:
Angled brackets didn't show up.
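
(For what it's worth, a Scheme macro can compute the same thing during expansion. A rough sketch only, assuming a syntax-case system where eval is usable while macros expand and an expression that mentions only globally available names; ct is a made-up name:)

(define-syntax ct
  (lambda (stx)
    (syntax-case stx ()
      ((_ e)
       (datum->syntax #'e (eval (syntax->datum #'e)))))))

(define x (ct (expt 2 16)))   ; 65536 is computed when the macro expands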

Yeah, I'm pretty sure that's

Yeah, I'm pretty sure that's partial specialization... and has nothing to do with macros or metaprogramming.

Is it? Oh, my mistake.

Is it? Oh, my mistake. Either way, it may not directly replace macros, but it covers most of the cases. When you do need full-blown metaprogramming like in macros, a fairly rare event, it's not terribly hard to do using compile-time functions and, say, the metaprogramming capabilities in Python.

Metaprogramming is what it

it may not directly replace macros, but it covers most of the cases

Metaprogramming is what it sounds like - writing metaprograms - that is, programs that result in programs. Not compile time evaluation, or anything. Don't get confused with C macros or C++ metaprogramming.

You give some good points, but these get lost in misunderstandings. Learn Scheme via SICP (free online); pick up OCaml; and while you're at it, read the excellent CTM. I've not read TAPL myself yet, but I gather it's good and it's my personal next step.

I personally find Python's metaprogramming poor. It lacks the power and the simplicity (which you obviously desire) of Lisp macros. It's true that metaprogramming in Lisp is a relatively rare event, but it's extremely useful and powerful when it is used. If you want, you can write macros that write macros that write macros that write code.

Metaprogramming is what it

Metaprogramming is what it sounds like - writing metaprograms - that is, programs that result in programs. Not compile time evaluation, or anything. Don't get confused with C macros or C++ metaprogramming.

Yes, I know what metaprogramming is.

You give some good points, but these get lost in misunderstandings. Learn Scheme via SICP (free online); pick up OCaml; and while you're at it, read the excellent CTM. I've not read TAPL myself yet, but I gather it's good and it's my personal next step.

What for? I already know quite a bit about them.

I personally find Python's metaprogramming poor. It lacks the power and the simplicity (which you obviously desire) of Lisp macros.

Python's metaprogramming (eval, exec, etc.) is fine for the very tiny amount of metaprogramming that's actually done in it. Most of the time, macros aren't even used for their metaprogramming capability, but rather for when they're executed.

What for? I already know

What for? I already know quite a bit about them.

To learn? To me it seems your knowledge of them is actually quite limited. You seem interested in PLT, so I thought you might want to learn about it. It would also give you a wider perspective on the design of your own language. You also seem to reinvent things a lot, and learning what has already been done would be good.


CTM is the most eye-opening programming book I've read, and most people will agree with this. After this, you'll know Scheme, OCaml and Oz: quite a powerful set of languages that python can't compare to.

To learn? To me it seems

To learn? To me it seems your knowledge of them is actually quite limited.

How so?

You also seem to reinvent things a lot, and learning what has already been done would be good.

Perhaps, but this isn't necessarily a bad thing. It means I actually understand what I'm implementing and I've already thought it through.

EasyExtend

I personally find Python's metaprogramming poor. It lacks the power and the simplicity (which you obviously desire) of Lisp macros.

It got quite a bit more powerful recently. Incidentally, my main motivation to create EasyExtend, besides curiosity and fun, was to enable partial evaluation and algebraic manipulation of expressions in Python (without Guido's permission ;). In EasyExtend the metaprogramming is compile-time only: the syntax transformations have to be performed before bytecode generation. The system is still in rough shape, i.e. there is no syntax sugar or templating for syntax transformers.

PE "vs" macros

Yes, compile-time execution of functions is partial evaluation. You may not be recognizing that it's the same as what you're describing, because of unfamiliar terminology.

Either way, it may not directly replace macros, but it covers most of the cases.

This isn't correct in general. It's true that in some systems, macros can be used to move computations to compile time, but this is not their primary purpose (or shouldn't be). In my previous comment, I summarized the three categories of motivation for macros, none of which are replaced by compile-time functions. Those three categories of applications for macros refute your belief that "macros and functions overlap completely in functionality".

It sounds as though you may be generalizing too broadly from your experience with macros in particular languages (perhaps C or C++)? In that case, please try to phrase things more like "covers most of the cases in a language like C", otherwise needless argumentation results — we've already seen that in this thread.

When you do need full-blown metaprogramming like in macros, a fairly rare event, it's not terribly hard to do using compile-time functions and, say, the metaprogramming capabilities in Python.

This is only partly true, at best. There's some overlap, because metaprogramming is a broad term. However, the sort of macros we're discussing ("Lisp-like") let you extend the syntax of a language, and neither compile-time functions nor Python's runtime metaprogramming features actually do that. [Edit: perhaps examining EasyExtend, mentioned in another comment, will help in understanding the purpose of "real" macros in the Python context.]

There are some cases where you can get away with using Python's existing syntax in clever ways, in conjunction with metaprogramming, to do something that you might otherwise use macros for, but this is not a general solution.

In addition, runtime metaprogramming, as in Python, tends to reduce the ability to statically analyze programs, which doesn't just affect compilers - it also tends to make programs more difficult for humans to reason about.

This isn't correct in

This isn't correct in general. It's true that in some systems, macros can be used to move computations to compile time, but this is not their primary purpose (or shouldn't be)

It's how they're used most of the time in the languages I've seen, e.g. Common Lisp.

I summarized the three categories of motivation for macros, none of which are replaced by compile-time functions. Those three categories of applications for macros refute your belief that "macros and functions overlap completely in functionality".

A function is a piece of code that takes data as input and returns something. A macro is a piece of code that takes code as input and returns code to be executed. Because code can be data, a function in a language with run-time metaprogramming capabilities is exactly equivalent to macros, barring time of execution. To demonstrate this, consider the usual 'until' macro, in Python:

def until(condition, body):
    while not condition():
        body()

In this case, it doesn't even need to use any metaprogramming capabilities, because the sole reason for this even being a macro in the first place is to control when 'condition' and 'body' are executed. Of course, if you really wanted to, you could write it using metaprogramming:

def until(condition, body):
    while not eval(condition):
        exec body

This is only partly true, at best. There's some overlap, because metaprogramming is a broad term. However, the sort of macros we're discussing ("Lisp-like") let you extend the syntax of a language, and neither compile-time functions nor Python's runtime metaprogramming features actually do that.

There's probably some convoluted way to shoehorn syntax extensions into functions, but I'm not even going to try. I don't find them useful at all, so I guess if you want syntax extensions macros do have a use.

In addition, runtime metaprogramming, as in Python, tends to reduce the ability to statically analyze programs, which doesn't just affect compilers - it also tends to make programs more difficult for humans to reason about.

Ah, quite the contrary. As I've already demonstrated with my 'until' example, macros have many uses, only some of which absolutely require the use of metaprogramming. In those cases, the resulting functions become clearer because you're using normal code constructs. I do admit, however, that the cases in which you absolutely need metaprogramming would probably be slightly more complicated, but I've honestly never seen such a situation in the first place, so I'm not inclined to worry about it.

Syntactic abstraction

It's how they're used most of the time in the languages I've seen, e.g. Common Lisp.

I'm not sure about "most of the time", but I agree that macros are commonly used for that purpose in Common Lisp. That simply underscores the point, though, that you can't generalize from the example of one or two languages. For example, macros are used primarily for syntactic abstraction in Scheme, not for general compile-time computation.

What "syntactic abstraction" means is being able to take some pattern of syntax that occurs in multiple places in a program, and simplify the program by abstracting it out to a common place. Of course, it's possible to use this ability to do the same thing as an ordinary function does (i.e. procedural abstraction), but we both agree that this isn't a real justification for macros. The justification for macros comes in when you use them to abstract syntactic patterns that you can't abstract away with functions. Your 'until' example is a good one: the client code that uses 'until' won't look like this:

until x > 10:
  x = x + 1

Why won't it look like that? Because you're missing syntactic abstraction. Of course, you might try to "shoehorn syntax extensions into functions", as you said, but at that point you're implementing a macro system, and that'd be the first time that you're doing anything that has to do with the real purpose of macros.
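
Here is what that looks like with syntactic abstraction; a minimal Scheme sketch of the until macro (in Scheme syntax rather than the Python-style syntax above):

(define-syntax until
  (syntax-rules ()
    ((_ test body ...)
     (let loop ()
       (if (not test)
           (begin body ... (loop)))))))

(define x 0)
(until (> x 10)
  (set! x (+ x 1)))   ; no lambdas at the call site: that is the abstraction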

A function is a piece of code that takes data as input and returns something. A macro is a piece of code that takes code as input and returns code to be executed. Because code can be data, a function in a language with run-time metaprogramming capabilities is exactly equivalent to macros, barring time of execution.

By "code can be data", and given your "until" example, you seem to be referring to higher-order functions. This has little to do with macros. The languages with good macro facilities also have higher-order functions, but that has nothing to do with syntactic abstraction. So "exactly equivalent" is incorrect, except in the one restricted case we've discussed, i.e.:

You're saying that in cases where macros happen to be (ab)used in order to execute procedural abstractions at compile time, that this usage can be replaced by a feature in which functions execute at compile-time. We agree on that much, but you can't extrapolate from this to say anything about the usefulness of macros for syntactic abstraction, since nothing you've described so far has addressed that point.

Ah, quite the contrary. As I've already demonstrated with my 'until' example, macros have many uses, only some of which absolutely require the use of metaprogramming.

What you've demonstrated with your 'until' example has very little to do with macros. The cases you're talking about optimizing are already optimized in languages such as Haskell, SML, OCaml, Scheme, Erlang (the list goes on) in a way that has nothing to do with macros — some of those languages don't even have macros. You seem to be proposing to duplicate an aspect of the compilation model of those languages, which is fine. However, you shouldn't confuse that with replacing macros.

If you're interested in an optimization such as compile-time functions, you would probably get a lot out of learning some more theory. Books like SICP, CTM, PLAI, and EOPL, all linked to in the Getting Started thread, are well worth reading, and will give you a good grounding. If you want something to focus on as a motivation, you might look into the lambda calculus, which is a lot simpler to learn than it might sound (see links on above page), and it's an ideal system for exploring partial evaluation. There are also systems available which will help you do that exploration.

Finally, a lot of the work on compilation is extremely relevant. As Paul Snively put it here, "under the hood, all compilers end up being functional and relatively close to the Lambda Calculus". I have a couple of other relevant links in this comment.

I'm not sure about "most of

I'm not sure about "most of the time", but I agree that macros are commonly used for that purpose in Common Lisp. That simply underscores the point, though, that you can't generalize from the example of one or two languages. For example, macros are used primarily for syntactic abstraction in Scheme, not for general compile-time computation.

Syntactic abstraction? As in, reader macros?

Why won't it look like that?

Because it'd look like this ;-)

while x <= 10:
    x += 1

I don't believe in specialized syntax outside of the compiler; so even though syntax extensions can't be replaced by functions, I don't think they need to be.

What you've demonstrated with your 'until' example has very little to do with macros.

It has everything to do with macros sans syntax modification, which I've addressed above. If you want to discuss my reasoning behind that, that's fine, but I assure you that functions can replace every use of macros sans syntax modification.

Could you explain to me how

Could you explain to me how functions (and because I'm in that kind of mood I'd like to emphasise /pure/ functions) can build and declare a new compile-time-visible type from scratch? Without access to an equivalent of Template Haskell's quotation monad?

I'm not sure what you mean

I'm not sure what you mean by "compile-time-visible type," could you explain?

Visible for purposes of

Visible for purposes of static analysis. You don't have to run the program just to know the type exists.

How about something like

How about something like this:

def reverse_arg(function):
    return (x,y => function(y, x))

def inverse_class(data, functions):
    functions = functions.map(([x,y] => [x, reverse_arg(y)]))
    return data::[**functions]

obj = inverse_class(
    1, {
        "sub": (x,y => x-y))
    }
)

obj.sub(2) # 1
obj.sub("1") # error, can't subtract 1 from "1"

In this case, inverse_class would be automatically executed at compile time, and thus 'obj.sub' can be checked as well.

edit:
Changed 'print' to 'sub' in example.

Purpose of macros

It has everything to do with macros sans syntax modification

Right, and what I've been trying to explain is that the only purpose of macros is syntax modification. What's confusing you is that it's possible to use syntax modification in the compile-time optimization way you're describing. However, eliminating the need for that falls under the category of "ordinary compile time optimizations" in any halfway-decent compiler (as described in the compilation links in my previous comment), and has nothing to do with macros.

Right, and what I've been

Right, and what I've been trying to explain is that the only purpose of macros is syntax modification.

What do you mean by syntax modification? It's not too often you actually see a reader macro used in the wild.

Let's slow down

Curtis, we're having some kind of misunderstanding about the nature of macros. However, this really isn't the place to hash that out. I'll send you a private email about that.

In the meantime, speaking as an administrator of the site, please slow down on the number of comments you're posting here. In the week since you registered here, you've been the most prolific poster, almost double the next highest (who is a long-time member).

Much of this has been because you've been discussing ideas of yours for which the rest of us have no details (type inference, and macros or compiler optimizations). LtU depends heavily on links to papers or articles. Before posting further on these subjects, please post a more detailed writeup of what you're thinking of, on your own website or blog, so that you can link to them in future comments here. Or, as Philippa suggested, post an interpreter or other semantic description. That'll cut down on the need to hash out details here, and avoid disturbing our regular readers.

Regarding an interpreter,

Regarding an interpreter, it'd probably be easier to start with a stripped-down "core" language that illustrates all the important points - or easier yet, the abstract syntax tree for one. I think I could see where to start building one from the descriptions given, but I can't be certain it'd be the right thing.

If you'd like, I can update

If you'd like, I can update the grammar I have of the language and upload it. I stopped using it a couple months ago when I really started getting parts of the interpreter done, as that's my preferred method of documentation, but it shouldn't take too much effort to update it.

Actually, nevermind. The

Actually, nevermind. The interpreter should be done in a couple weeks, so it'd probably be easier to just release it as a mockup.

heh

Sorry I didn't notice this sooner, apparently it got lost when I replied to a message :/

Curtis, we're having some kind of misunderstanding about the nature of macros. However, this really isn't the place to hash that out. I'll send you a private email about that.

I recently changed email addresses. The new one is now in my profile.

In the meantime, speaking as an administrator of the site, please slow down on the number of comments you're posting here. In the week since you registered here, you've been the most prolific poster, almost double the next highest (who is a long-time member).

Heh. Sorry, I can get carried away sometimes in topics like this.

Much of this has been because you've been discussing ideas of yours for which the rest of us have no details (type inference, and macros or compiler optimizations). LtU depends heavily on links to papers or articles. Before posting further on these subjects, please post a more detailed writeup of what you're thinking of, on your own website or blog, so that you can link to them in future comments here. Or, as Philippa suggested, post an interpreter or other semantic description. That'll cut down on the need to hash out details here, and avoid disturbing our regular readers.

Aye. I've tried to be at least somewhat abstract in my posts regarding the concepts I'm talking about, but I find that writing code helps to convey what I'm talking about better than just text alone, although I can tell I definitely need to relegate code examples to a secondary role and keep to text for the main concepts.

Metaprogramming is what it

Metaprogramming is what it sounds like - writing metaprograms - that is, programs that result in programs. Not compile time evaluation, or anything. Don't get confused with C macros or C++ metaprogramming.

You then replied: "Yes, I know what metaprogramming is." But the until example isn't what macros are used for. You've already acknowledged this is partial evaluation, and not metaprogramming.

Now you seem to think macros are about compile-time evaluation again. Seems we are going round in circles.

Considering 'until' is a

Considering 'until' is a fairly common example of macros, I'd say they are used for things like that.

Partial Evaluation with Call-by-Name

Couldn't you completely replace macros with partial evaluation if you also had call-by-name?

No

Lisp-style macros let you:
1) extend syntax
2) add domain-specific optimizations
3) generate code from a specification, or arbitrarily transform code

These are three things that cannot be replaced by uses of HOFs or by having compile-time functions (unless they can be used to splice in code, in which case they are macros) or by PE. Run-time metaprogramming techniques (reflection) allow 2 and 3.

extend syntax

You can extend the syntax of Lisp as long as it looks just like Lisp. That is what I thought was ironic about the recently posted video using Lisp to recreate a DSL of Martin Fowler's. The first thing the person does is make everything look like Lisp, and not like the original DSL. This seems to me like an instant disqualification.

Are there languages that allow sublanguages to be defined down to the lexical parsing level? [..googles for "scannerless parsers"]

Lexing & parsing is a solved problem

Camlp4 allows you to use different lexers. It's pretty flexible.

On the Lisp/Scheme side, writing a lexer and parser for a language is the easy part, especially if you use a parser generator and even a lexer generator — some people don't like doing that, but it does tend to make it easy.

It's a pity Fowler didn't just do that. He must know how easy it is, and perhaps assumes the audience would know that — parser generators are one piece of academic computing technology which propagated to the mainstream a long time ago.

There are numerous good parser generators available for Scheme, and presumably for other Lisps. Implementing the semantics is where you really want to bring the heavy machinery to bear, though. In Scheme, implementing a complete DSL can involve little more than writing a few macros and snarfing the rest of the features you need from Scheme.

Assuming you already have the language design — the one part that can still require real work — and assuming there's nothing terribly unusual about the language, implementing a complete small DSL can take a few hours, including lexical syntax and grammar. Of course, you do need to be familiar with all the tools.

PLT Scheme is a good place to look if you want to see real languages implemented in Scheme. It includes its own parser generator library, and supports language implementation in a number of ways: in its module system; in its macro system, which includes a syntax object representation that includes precise source location info; and in its IDE (DrScheme). It has an implementation of a version of Python (see From Python to PLT Scheme), as well as subsets of Java used for teaching (ProfessorJ). The DrScheme IDE's ability to syntax highlight and graphically annotate code (e.g. point from uses of variables to where they're bound) works on these languages because of the underlying syntax system — see the screenshots in the linked Python paper. Of course, these language implementations are more complex than the DSL implementation I described above, because they implement big languages.

Factor allows you to write

Factor allows you to write 'parsing words' with which you can parse any syntax and generate Factor code to be spliced in. In the Space Invaders emulator I wrote, I built a DSL based on the Z80 instruction set. Factor would parse this and produce the Factor code to emulate each instruction. Instructions looked like:

INSTRUCTION: CALL P,nn ;
INSTRUCTION: SBC A,B ;

The INSTRUCTION: word was a parsing word. It analysed the input stream up to a semicolon and used parser generators to parse and produce Factor code from this. The code is here.

Reader macros

Common Lisp has reader macros that (modulo an "escape") would allow you to do that, though I will admit I wasn't really aiming at them (though I wasn't excluding them either). Either way, what defmacro does can still be described as extending syntax, but as you say, not at the lexical level. Others have pointed out other languages. Though I should add, in response to the post on Factor, that Forth (a direct predecessor of Factor) also has parsing words.

extend syntax

I think it would be useful for a language to provide a string literal syntax that (almost) never requires escaping. Use parens or braces as the begin/end string markers, but allow the string to contain other braces if they balance. Only stray unbalanced parens would require an escape and the escapes wouldn't compound as strings are nested. Strings containing code containing strings containing code containing strings would not need any escapes.

(string (the stuff between parens is a literal string (this is still in the string) string ends here->)  )
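
The scanning rule is just parenthesis counting. A minimal Scheme sketch, assuming the opening ( of the literal has already been consumed:

(define (read-balanced-string port)
  (let loop ((depth 1) (chars '()))
    (let ((c (read-char port)))
      (cond ((eof-object? c) (error "unterminated string literal"))
            ((char=? c #\() (loop (+ depth 1) (cons c chars)))
            ((char=? c #\)) (if (= depth 1)
                                (list->string (reverse chars))
                                (loop (- depth 1) (cons c chars))))
            (else (loop depth (cons c chars)))))))

Only a stray unbalanced paren inside the string would ever need an escape.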

PostScript

I'm not sure whether you know this or not, but this is one of the string syntaxes that PostScript (and thus PDF) supports, as described in the PostScript Language Reference Manual, 3rd Edition (7.5MB PDF).

Access to syntax

Anything macro-like that you implement with call-by-name would still be used via syntax that looks like an ordinary function call. To change that, you'd need a way to pass syntax to a function, before it's given the meaning that it would usually be given by the language parser. Once you allow that, you essentially have macros.
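
A sketch of the point, using thunks to simulate call-by-name in Scheme:

(define (until-fn test body)      ; test and body are thunks
  (if (not (test))
      (begin (body) (until-fn test body))))

(define x 0)
(until-fn (lambda () (> x 10))            ; evaluation order is right, but the
          (lambda () (set! x (+ x 1))))   ; lambdas are stuck at the call site

A call-by-name language would hide those lambdas, but the use would still be shaped like a function call; changing the shape itself takes a syntactic transformation.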

Lisp-style macros are

Lisp-style macros are effectively syntactic sugar for what I've described. They seem one of the saner ways to go about metaprogramming to me.

What did you describe? All

What did you describe?

quote and eval.

quote and eval.

Macros are overkill for 99%

Macros are overkill for 99% of the time metaprogramming is needed.

?

What is your replacement?

See my reply to Anton.

You might think so.

You might think so. Personally, I've another 99% of cases to run with because half the things you're talking about don't need metaprogramming in many languages anyway.

Could you elaborate? I'm

Could you elaborate? I'm having a hard time understanding what you mean.

Even though you know

Even though you apparently know that metaprogramming is about writing programs which generate programs, all your examples have focused on compile-time evaluation, which has nothing to do with metaprogramming.

So 99% of the code that needs metaprogramming probably does need macros.

Even though you know

Even though you apparently know that metaprogramming is about writing programs which generate programs, all your examples have focused on compile-time evaluation.

That's because it's the most common use of macros. If you give me a situation in which metaprogramming is useful, I'll gladly show you how it would be done.

I disagree with your

I disagree with your one-sided characterisation of meta-programming as program driven code/AST generation. Computational reflection is clearly a meta-programming facility and Python as well as Ruby are quite strong at it right now.

Your idea of "99% of the

Your idea of "99% of the times you need metaprogramming" has been influenced by things that don't actually need it and can be done via partial specialisation or lazy evaluation (Haskell has pattern-matching as part of the core language and as the main - or indeed only in the absence of IO - driver for evaluation, but all other control flow is expressible in terms of pattern-matching and recursion alone, to the extent that continuations can be implemented as a monad). I can think of multiple use cases for metaprogramming facilities off the top of my head that don't come into that 99%, whereas your examples don't spring to mind at all.

Metaprograms that inspect and optimise code, possibly using partial evaluation, would be something that occurs to me - but the use of having them, rather than a sufficiently smart compiler that can figure out where to perform partial evaluation, would be in encoding extra domain knowledge. Using GHC's rules facility to add fusion and related optimisations would be an example.

Another might be generating accessors, datatypes etc from a database, or alternatively generating (or building code to generate, remove and otherwise administrate) the tables etc in the database from a type. At risk of recalling the O/R mapping wars, the latter in particular's something I want to have a play with - it seems a sensible way to get all the definitions in one place.

Your idea of "99% of the

Your idea of "99% of the times you need metaprogramming" has been influenced by things that don't actually need it

I know. I was giving examples for how compile time functions remove the need for the most common usage of macros, not necessarily metaprogramming.

Too many cooks

I don't think the LtU membership could agree on enough details to produce one language (in order to satisfy everybody, it'd wind up like PL/I, but less concise). It could be interesting to have the discussion so like-minded people can get together and make several languages.

Re: Too many cooks

Yeah. The best languages are always started by a single person trying to fulfill a real goal. Besides, we all probably have our favorite language already anyway.

Besides, we all probably

Besides, we all probably have our favorite language already anyway.

We just haven't implemented them yet. ;)

what problem would it solve?

Trying to get together a more or less random bunch of people to design and implement a language invariably leads to massive languages that lack elegance or conceptual consistency. See Fortress, Groovy or Perl 6 for examples.

It might be better to ask what problem such an LtU language would solve. Who would be the users? Why would they choose this language rather than existing ones with large communities and codebases?

Having said that I think there's one problem that we could all agree on. How do we advance the state of the art of programming language design as perceived by the mainstream?

This is a question that we on LtU have implicitly and explicitly asked many times over the last few years but we've never devised an effective answer.

First, Advance the State of Education.

Someone quoted me saying:

Academics are continually chewing pieces off of impossible and making them merely difficult.

But I spend my time translating the difficult into the understandable.

I try to read and understand research papers well enough to translate them into something a motivated professional programmer can understand and most importantly use. (For example, with The Monad.Reader and three years of explanations on the #haskell irc channel.)

My experiences on the #haskell channel imply that there is not enough organized and accessible documentation for the research we already have.
I would much rather create a "code as literature" website to teach elegance by example, or refactor all the threads on LtU into wikipedia-style articles.



For example, there's a common pattern on the original wiki where a wikipage will start out in ThreadMode, and once the activity has died down, will be refactored into DocumentMode, FaqMode, or whatever is appropriate for the subject. That could easily be applied here on LtU.


Increasing the accessibility of the knowledge already on LtU will advance the state of the art of programming language design as perceived by the mainstream.

--Shae Erisson - ScannedInAvian.com

Not such a bad idea

How about we exchange some emails and see how we can make this happen?

Good Idea!

I really like the idea of LtU threads refactored into some kind of LtU wikipedia. It'd certainly make some of the wisdom buried in the archives much easier to find.

Need help?

Just let me know what to do :).

nifty idea

As educational as it would be, I'd have a hard time dragging myself away from it. It would totally decimate my productivity, but it would still be worth it. ;-D

Status ?

What's the status on the wiki idea? It would demand a large editing effort, but it would be a great resource.

I am willing to help too.

Update

We started thinking about it and kicking the idea around. Give me a couple more days to think things through.

Anyone willing to help should send me an email, so I can get in touch.

address ?

What's your email address?

Why should we?

How do we advance the state of the art of programming language design as perceived by the mainstream?

Hmm, to make us feel good that we are doing such an important and nontrivial job?

I think the fundamental decisions would be difficult

Fundamentally OOP or functional?
Static or dynamic typing?
Expressive or machine-oriented?
Concurrency-oriented or not?

Etc. :)

Now what would be interesting is designing a simple language *system* that was trivial to port and use for real work. It's easy enough to design a fantastic language, but what's more important is to make that language implementable and useful in a reasonable amount of time. Assume that it has to be portable to various embedded systems and game consoles and used for real work on those systems.

a simple language *system*

Given the affinity of LtU for domain-specific languages, the present topic of joining together on one language is somewhat at odds with our common interest.

I second the interest in a language system. But perhaps existing tools are enough and a system isn't needed. It's a little puzzling how rarely LtU mentions the usefulness of specific tools, libraries, boilerplate, and architectures for implementing DSLs.

It's much harder than you think

One of the reasons to study something, and programming languages are no exception, is that you learn that things are often more complicated than they might first seem.

I think this is one of the fundamental reasons for unproductive threads like the one about C++. After spending time and effort studying PLs, we know just how hard it is to produce optimizing compilers, which language features are relevant, and how much of the effort consists of non-creative grunt work.

We know how many languages are out there, and know that designing something better, and better supported, isn't easy (recall the Links thread from a while back - we weren't very appreciative, and the people involved are Language Gods).

We know how important language communities are, and know just how hard it is to create a viable language community, and the large part luck plays in this.

Etc. etc. So unless we are talking about a specific need, or a specific language feature we are interested in, there's little chance of us building yet another toy language.

(And oh. There are some great languages out there. Not perfect, but quite good. It's just a matter of knowing where to look.)

It certainly is hard...

to get things right, but my feeling, looking at my longtime favorites Scheme and PostScript, is that it is possible to design a minimalist language with amazing expressiveness. E.g., a sketch of a PostScript interpreter in PostScript:

{ currentfile token { exec } { exit } ifelse } loop
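
For comparison, roughly the same sketch transliterated into Scheme (ignoring error handling, and assuming the optional R5RS eval and interaction-environment):

(let loop ()
  (let ((form (read)))
    (if (not (eof-object? form))
        (begin (eval form (interaction-environment))
               (loop)))))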

C, my home turf, sucks somehow, but it is a usable mix of machine-language features and features that let me keep an overview: a textual code representation that reflects the nesting of structure, and a reduced need for arcane details like segmentation, labels, and argument passing in registers and on the stack. Deep inside, however, there is a wish in many people, including me, to keep control of these features as well. A failure to modularize and abstract, if you wish. Otherwise, C wouldn't have survived.


(And oh. There are some great languages out there. Not perfect, but quite good. It's just a matter of knowing where to look.)

Perhaps the many greater experts in this forum could send those of us less enlightened, who share the wish for an expressive, extensible, self-modifying, bare-metal language, along the right search path.

wish list!

Great idea! Since I've been trying to do the same thing, as a novice, I have a few suggestions:

1. Document this process very well. Well enough that an average mainstream programmer can learn from it (an average mainstream programmer being someone using Java to do JDBC or C# to do web pages for a human resources department, etc.)

2. I totally agree with keeping the compiler very simple

Many languages have tuples/records as basic datatypes which cannot be extended (add/remove); it seems this would be useful where you have 10 attributes but only want to deal with 3 of them.

3. If we end up with a Scheme-like language, I suggest adding types, pattern matching, and making regular expressions part of the 'official distribution.' Basically, learn not just from the latest functional languages designed by PL experts (O'Caml, Haskell, Clean) but also from languages in wide use (Perl, Python, Ruby, Javascript and obviously Java, C#)

4. Syntax! If we end up with a Scheme/Lisp-type language then this point doesn't matter. But I personally would LOVE to have a language that tries to keep C/C++/Java/C#/Javascript syntax! I know many people don't like it (I guess I'm just used to it), but an absolutely huge number of programmers would be more likely to look at this language. The same can be said about VB... but I don't care for VB :)

I would suggest that Javascript be looked at in detail. It has a syntax of the C family, but has things like lambda functions, lists, etc.

5. More on syntax (Lisp-like syntax extension for C):
Providing TXL-like syntax extension for a non-Lispish language might be a good idea as well. Since the compiler is supposed to be very simple, such syntax extensions would allow developers to add things like list comprehensions as easily as one imports any Java package into a source file.

I would also suggest keeping certain problems in mind while designing the language (and syntax). For example, the implementation of B-trees looks very easy on paper but is surprisingly difficult in code... is it because of the limited expressivity of languages? Simply do away with the object/relational mapping problem for the vast majority of cases (for example, the Clean language's list comprehensions even allow joins! ... why not extend them to provide group-bys, cube-bys, etc.?)

Any way, these are just some quick thoughts.

Many languages have

Many languages have tuples/records as basic datatypes which cannot be extended (add/remove); it seems this would be useful where you have 10 attributes but only want to deal with 3 of them.

I am not sure what you are trying to describe here, but it sounds like subtyping might solve the problem. O'Haskell has a feature named record stuffing, which maps a selector name to an identical variable already in scope.

I totally agree with keeping the compiler very simple

This puts quite a few constraints on how your language is designed. Optimizing Compilers for Structured Programming Languages is an excellent read.

Guarded Single Assignment form

GSA sounded neat, but newer versions of that Oberon compiler don't use it.
oo2c post on using SSA.

C++ done right?

Rather than recreate a whole new language, how about just hacking C to be more Haskell-like?

This is probably a slippery slope where we end up with something like a strict Haskell eventually, but how about gradually trying to clean up C, adding the easy features that allow the programmer to retain the same mental model of the program but make the code "cleaner"?

Some suggestions:
1. algebraic data structures and pattern matching
2. type classes
3. currying
4. lisp-like lists (with reference counting for GC) and tuples

The idea is to have something that can be easily compiled down to standard C which would allow existing libraries to be used without a non-trivial FFI.

Some of the things on my list above (currying?) might turn out pretty hard to compile to C: so these are just suggestions.

More ambitious additions would include closures and a dynamic universal type.

The goal is to have programs that are very similar in spirit to C programs except they're more concise and easy on the eyes.

Although this has limited value for the PL theorists out there, in practice I can see such a language being very useful, and easy to adopt since it's basically C with syntactic sugar.

That's what I was thinking...

...though rather than "C++ done right", I would describe it as "Stepanov's dream language". Think of it as a language designed for implementing the STL in the most elegant way possible.

The language would be an imperative language with a Milner-esque type system, but have language support for concepts, which would look like type classes, only more so (e.g. dependent types), and have true dynamic binding (unlike C++ templates).

The idea is to keep the stuff that makes C++/Ada programming cool, but lose the historical baggage (from C/Algol respectively) and remove the (in retrospect) arbitrary restrictions.

Oh, and algebraic data types are a must, though I could personally give lisp-like lists a miss. If you want a modern Lisp variant, you know where to find it. :-)

more so than Felix

That reminds me of Felix and not just because it's from Australia. It's a "C++ done right" with an ML style type system, and AFAIK without the power of concepts. Going beyond that to Stepanov's dream language would be both interesting and practical.

I'm thinking in a similar direction..

What would you think about macros? Good, bad?

Macros

C++ macros? Bad.
Lisp macros? Good.

Just my two cents.

Macros...

As David Teller said: "Lisp macros good".

For a C++-like language, I think the direction to look in is something like lazy template instantiation, only more macro-like and with a nicer syntax.

Thanks

Yep, those were the ones I meant (how dare I post about preprocessor macros on ltu :o).

Rather than recreate a whol


Rather than recreate a whole new language, how about just hacking C to be more Haskell-like?

That's one of my lines of thinking.

My list of modifications:

  • Do away with the distinction between expressions and statements.
  • Prep up the logical operators to be more Perl-like: the value of a && b should be b if a is non-null. I just love Perl's open "file.pl" or die; (see the sketch after this list.)
    In Javascript, a couple of things have been done right in this respect.

  • A basic way to explicitly collect the upvalues for lexical closures, e.g. in a nameless aggregation of values (an ad-hoc struct that is allocated on the heap and expressed with a lean syntax), in conjunction with one or more nameless ad-hoc functions with variable references into the aggregation.
  • Scheme/CLisp-like macros acting on the syntax tree of the parsed input language that can annotate types of expressions and expand macros with information about expression types. This would require a simplification of the syntax (see expr/stmt above)
  • more...
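
A sketch of the Perl-ish short-circuit semantics from the second point above, transliterated into Haskell with Maybe standing in for null (andThen and orDie are made-up names):

-- a && b: yields b only when a is "non-null"
andThen :: Maybe a -> b -> Maybe b
andThen (Just _) b = Just b
andThen Nothing  _ = Nothing

-- open "file.pl" or die: take the value or bail out with a message
orDie :: Maybe a -> String -> a
orDie (Just x) _   = x
orDie Nothing  msg = error msg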

why not c+?

Basically take C, and add to it to make it better:
-Continuations (still not sure exactly what this is, but it seems important :) )
-Better threading
-Better pointer system (perhaps safer?)
-Higher order functions
-Better native array system (so that, if programmed correctly, APL would have no performance advantage over C?)
-Better memory handling (some sort of garbage collection?)
etc., etc., etc.

Keep it a good language for systems programming, but extend it to be a better 'intermediate representation' language. That way DSLs could be built on top of C without too much hassle, and we could have mini-C++s for specific domains... no?

Continuations in C

C has something already built in with setjmp() and longjmp(), originally intended for servicing asynchronous interrupts, which is a form of continuation.

And for lack of a better place to put it, the one language feature I'd really like to see more languages build support for: Icon's concept of Success and Failure. (Griswold knows strings).

C non-local exits

A longjmp() is an upward non-local exit to an earlier point defined by setjmp(), unwinding the C stack only, and can't support "winding" the stack in a downward jump, as provided by general continuations.

Calling a continuation is a lot like jumping up or down the stack to a spot saved when a continuation is created, and there are about as many stacks as continuations (... which share structure in languages like Scheme, where a stack is one path in a forest of heap-based environment frames.)

A C jmp_buf initialized by setjmp() basically captures all register values -- including the top of stack -- that ought to be restored by a later longjmp(), which pops the stack by virtue of restoring an earlier copy of the stack pointer. The behavior is the same as C++ exception throwing, except it's unfriendly toward finalization, since destructors (for example) won't be called between the longjmp() and the second return from setjmp().
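
To make the contrast concrete, here is the upward escape, the part setjmp()/longjmp() can express, written with first-class continuations in Haskell's Cont monad (a sketch using the mtl library):

import Control.Monad (when)
import Control.Monad.Cont (callCC, runCont)

-- 'exit' plays the role of the jmp_buf: invoking it abandons the
-- rest of the computation and returns to where callCC was entered.
checked :: Bool -> Int
checked failed = flip runCont id $ callCC $ \exit -> do
  when failed (exit (-1))  -- the "longjmp": a one-way upward escape
  return 0

A general continuation, by contrast, can also be invoked to "wind" back down into a computation, which no combination of setjmp() and longjmp() can express.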

Rather, let's each make a different language

Wouldn't something like the Indie Game Jam or 24 Hour Comics Day be more fun? I don't know if a "24 hour language" is a reasonable goal, but I'd actually like to try it sometime.

That sounds like fun

But let's start with a DSL. Pick some suitably interesting domain, and see what we can each come up with.

I think it is going to be instructive to see how each of us comes up with a language that reflects his own background...

Sure does

Any idea ?

CA

I could use a good DSL for describing (and simulating) cellular automata...

Pynchon

I've messed around with this a little bit, but... I'm by no means an expert at the whole language thing. I was just bored in English class.

Pynchon is what I thought up. Feel free to tell me it sucks.

real world DSL

I'd like to suggest picking a DSL that is unrelated to computing. This would ensure that there is no prior experience or preconceived idea about what the DSL should look like.

Er...

Do you mean a DSL related to shopping or visiting a zoo? I'm not sure I understand your "unrelated to computing".

Not having computer-based computing as its target domain

...e.g., a DSL for financial instruments, as opposed to a DSL for describing lambda expressions :-)

I wonder: whenever the domain of a DSL is computing, does it really become a GPPL?

yes

"Do you mean a DSL related to shopping or visiting a zoo ?"

Something like that, yes. How would you like to program a robot to do a day's household tasks, or something similar?

It's about the entire environment imho

I think there's no one language that fits all...

I think there's much more of a need for an extensible computing system and environment, which means a compiler/runtime/base semantics that is extensible enough to build new languages/semantics on it. An example of such systems are the LispOS variants.

I'm interested in Slate/SlateOS (the OS is only a plan currently) myself, where the base semantics/language is a prototype-based, multiple-dispatch OO language with Smalltalk syntax. But of course, on top of it you MUST be able to build new languages/semantics, like a pure functional one, when the problem demands it. The important thing is to have an extensible platform with compiler building blocks, etc., where you can very easily integrate the different semantics.

No more C FFI's, GC trouble, and all these problems that exist only due to the fact that C is the base semantics where the different languages/systems meet.

The creator of Slate thinks of it as a better systems programming language than C, even though it has an excellent macro system and other high-level stuff.

Hope my wording is not too obscure...

Thanks all for the input.

Thanks all for the input.

How about a language to end the ongoing dynamic vs. static saga.

Did you know that C++ can be fully dynamic? Using operator overloading and templates, a special 'variant' class can be created that accepts anything as its value and can be used in any context. I am saying this not to praise C++, but to show that a language can be programmatically made to be both static and dynamic. Here is an example:

Variant v = 10;
Variant x = v + 1;
x++;
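
For comparison, other statically typed languages can pull the same trick; in Haskell, for instance, Data.Dynamic provides just such a universal 'variant' type. A minimal sketch:

import Data.Dynamic (Dynamic, toDyn, fromDynamic)

v :: Dynamic
v = toDyn (10 :: Int)        -- accepts any Typeable value

x :: Maybe Int
x = (+ 1) <$> fromDynamic v  -- recover it and compute: Just 11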

Self-modifying code and macros are a must

I don't know about self-modifying code (I've never needed it - does anybody have good experience with it?), but my ideas revolve around macros as THE basic mechanism for this language.

I don't think the LtU membership could agree on enough details to produce one language (in order to satisfy everybody, it'd wind up like PL/I, but less concise).

It depends on our maturity. I think it will work on LtU, because members seem mature enough. I've tried on other sites to run a collective exercise (for example, allegro.cc) only to fail miserably because people failed to co-operate.

It could be interesting to have the discussion so like-minded people can get together and make several languages.

Why not house all these languages under one roof?

It might be better to ask what problem would such an LtU language solve?

all problems, in a more elegant, shorter and better way.

Who would be the users?

everyone.

Why would they choose this language rather than an existing ones with large communities and codebases?

If it turns out to be truly better than other languages, then why not?

Fundamentally OOP or functional?

Static or dynamic typing?

Expressive or machine-oriented?

Concurrency-oriented or not?

All the above. It's gonna be user-defined, anyway.

The idea is to have something that can be easily compiled down to standard C which would allow existing libraries to be used without a non-trivial FFI.

My idea, exactly.

I think there's no one language that fits all...

why not?

C++ always feels like several languages to me

Each with a kind of personality of its own. There's C lurking in the background as a close-to-the-metal language. Then there's Simula strapped on with classes. Then there's generics strapped on with templates. Then there's... and on and on... It can be both a blessing and a curse to be multiparadigm. The point is not whether it can be done, but how seamless and natural the transition is from one paradigm to the other.

The designers of type inference engines know that dynamically typed programs represent a subset of statically typed programs, wherein a dynamic variable is really a variable whose type is an intersection of declared types. Even knowing that, it's very hard to bypass the type engine, because you have to be explicit in defining the intersection - continually adding new types to the data type declaration - and you also have to intersperse the constructors every time you actually want to manipulate the variables.

Ok, so C++ can use a variant type, but I don't think its type system qualifies as what I'm aiming for. Or to put it in a different light: what if C++ were a language where templates are the only part you actually use?

Ok, here is a first suggestion.

First of all, the language should be built upon s-expressions. Now that sounds limiting at first, but it need not be. With the power of macros, s-expressions can be transformed into whatever one desires.

For example, infix notation can be done using macros (I am using square brackets instead of parentheses because I am too lazy to press shift while typing):

[macro [x + y] [+ x y]]

Each time the compiler meets an expression of type [x + y], it is automatically translated to [+ x y].

As you can see, the concept of macros is fundamental. In fact, little more than a good macro system is needed. Everything could be defined with the help of macros.
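
A minimal sketch of this compiler-as-macro-expander idea in Haskell: terms are s-expressions, and a "macro" is just a rewrite on the tree (the infix rule from above is hard-coded for brevity):

data SExp = Atom String | List [SExp]
  deriving (Show, Eq)

-- [macro [x + y] [+ x y]]: rewrite infix applications to prefix,
-- recursing so the rule fires anywhere in the tree.
expand :: SExp -> SExp
expand (List [x, Atom "+", y]) = List [Atom "+", expand x, expand y]
expand (List xs)               = List (map expand xs)
expand atom                    = atom

-- expand (List [Atom "1", Atom "+", Atom "2"])
--   ==> List [Atom "+", Atom "1", Atom "2"]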

Macros can be used for any syntactic sugar. For example, one could use '.' for declaring membership:

[macro [x . y] [x y]]
[macro [x . y . z] [[x y] z]]
[macro [x . y . z . q] [[[x y] z] q]]

Macros make it possible to write code that is remarkably similar to other structured languages, without too many parentheses. For example:

[function average [int x int y] [return [[x + y] / 2]]]

Secondly, the lowest level of programming would be machine code. Not even assembly! With the help of the macro system, one could define meaningful representations of instructions and/or translations to the native instruction set.

For example, the instruction mov EAX, EBX could be defined as:

[macro [mov EAX EBX] [10 40 50 60 127 125]]

Above the machine code level, a virtual instruction set should exist. This virtual instruction set would be the one that a possible virtual machine is built upon, or translated to the native instruction set. Again, the macro system would be used for translating this 'virtual instruction set' to a real instruction set according to the imported library. For example:

[macro [mov R0 R1] [mov EAX EBX]]

The power of macros is so great that it can also be used to provide dynamic processing of language constructs, during compile time. The basic compile-time data structure will be the list, since it has been proven to be very versatile in most advanced languages. The compile-time processing will also have numbers and strings as data types. For example, a macro could output code according to parameters:

[macro [add x y type] [
    if [type is int] 
    then [add-int x y]
    else [error 'unknown type for add']
]]

Macros should be allowed to be defined within macros. For example:

[macro [define x] [macro x]]

Above the virtual instruction set level, a C-like structured programming model should be defined. The basic data types should be:

  • signed and unsigned 8/16/32/64 bit integers
  • 32 and 64-bit IEEE floats
  • bool, string (unicode), char

The basic constructs of the C-level should be:

  • structures
  • variables
  • pointers
  • procedures/functions

The C-level would make it possible to directly invoke C and operating system routines without too much hassle.

Above the C-level, an object-oriented system with Java/C#/whatever capabilities should be made. By using the macro system, it would be dead easy to translate a class declaration to a set of data structures / procedures.

Above the object-oriented system, a pure functional system would be made, with all the goodies of functional languages.

Macros can be used for doing various tricks, for example declaring method calls or accessing structure members. For example, a struct declaration could define macros that correspond to accessing a particular part of a memory block:

[macro [struct name members] [
    [for-each member members [
        [macro [base [name-of member]] [[address-of base] + [offset-of base member]]]
    ]]
]]

In case you are confused, the above piece of code creates one macro for each member of a struct. All the macro does is construct a reference to the proper address within a struct according to the member's byte offset.

The type system could also be constructed out of macros! Since the concepts would be totally user-defined, types could be totally user-defined too. By using macros, one can 'define' type names, and place code in macros that checks whether expressions yield compatible types, thus enforcing a static typing mechanism where needed.

Furthermore, macros can be used for enforcing correct implementation, by examining code and applying various tests to it. For example, a macro could ensure that no assignment takes place in a piece of code, by emitting a compile error for each 'destructive update' found.

Finally, optimizations can also be applied by using macros that examine the code and alter it. For example, constant folding can be done by examining each expression and replacing constant subexpressions with their computed values.
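
Constant folding as such a code-inspecting macro might do it, sketched as a pass over a hypothetical expression AST in Haskell:

data Expr = Lit Int | Var String | Add Expr Expr
  deriving Show

-- Examine each expression; where both operands are constants,
-- replace the expression with its computed value.
fold :: Expr -> Expr
fold (Add a b) = case (fold a, fold b) of
  (Lit x, Lit y) -> Lit (x + y)
  (a', b')       -> Add a' b'
fold e = e

-- fold (Add (Lit 1) (Lit 2))   ==> Lit 3
-- fold (Add (Lit 1) (Var "n")) ==> Add (Lit 1) (Var "n")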

Conclusion: the compiler can be reduced to a macro-processor and symbol-manager. The basic operations of the compiler would be:

  • declare macros
  • export symbols
  • import symbols
  • declare macro variables (as temp storages used for compile-time processing)
  • managing namespaces
  • doing arithmetic, logical and other operations

All other functions can be programmed on top of that.

foo

"For example, infix notation can be done using macros"

and then you give a math expression as an example, which is really the only case where you'd ever want infix notation in programming: so that it looks like familiar math, even though you'll be writing 1 + 2 + 3 + 4 + 5 at the expense of the simple and concise (+ 1 2 3 4 5).

Besides, let's not forget you'll have to call the macro name first, so, it's actually, say, (math (1 + 2 + 3 + 4 + 5))...

"the compiler can be reduced to a macro-processor and symbol-manager"

this sounds a lot like m4 producing C code.

Plenty of interesting ideas but...

The premises are interesting. In particular, I fully agree that "clean" (AST-based rather than string-crunching) macros are essential. I have a few criticisms, though.

  • Going down to actual assembly level is really looking for trouble, if only because 686/PPC/Mips are rather different from each other and making cross-CPU programs is going to be hellish.
  • Piling up so many macros on top of each other looks like a good idea in theory, but it might end up like C/C++ libraries: drowning under macros. I don't have a good idea for solving this, though, nor do I know whether it is a real concern.
  • Making a type system for something so low-level might be quite difficult.
  • For now, this is a language without purpose.
  • What about concurrency and garbage-collection ?

I suggest

  • Compiling to some already-existing bytecode structure. Either JVM, .Net, LLVM or anything similar (even C).
  • Giving the possibility of programming the macro system with the exact same language as the program (that's probably what you had in mind anyway).
  • Thinking about one specific type of application for which the language would be useful.
  • Discussing GC and concurrency, as they are quite low-level choices.

I agree with macros

I agree that macros need to be the basis of such a language, since they help blur the distinction between compiler and application. I recommend you look at Pliant for an example of a working system that has this emphasis.

My own ideas have revolved around providing markup for C that would allow things like concurrency and garbage collection, and then attaching a macro processor library to reduce the marked-up code to C. The intent would be to evolve the set of macros from there, and build up layers as you described, so that programmers can choose which of several layers to write code for, but use a single interface to the compiler.

My approach is to market this idea as a portable C library that people can use piecemeal. The library itself would be bootstrapped, so that its code would be written in the macro language.

Interesting aside: I also chose [] as the macro delimiters, to avoid having to hold the shift key. ;)

My home page has inklings of these ideas, but I haven't documented most of them. I'm finally making progress since I learned of ProtoThreads, and am applying its (ab)use of Duff's device to implement full continuations in portable C.

http://fig.org/gord/

fig.org/gord doesn't seem to work

I get a 'connection was refused' error in firefox

Macro expressiveness

I've argued the issues with this approach of building up everything from macros here, here and here, the second and third explicitly mentioning building from assembly.

The condensed version: What you would actually get if you went this route is simply a collection of compilers, not one coherent language.

compiling to assembly

why not?

I think we have to clearly distinguish between the language and the library.

It seems to me that a compiler does nothing but parse, do some stuff with the AST and then convert it into some sort of (virtual) machine language. I would find it only fair if developers had an influence on this process. A very promising approach, but a little bit too complicated, is the register transfer language approach in GCC (it seems to have changed with GCC 4.0). How would this be packed into a library?

The actual VM language generation should be part of the library, and not the language. Same holds for parsing and the syntax. There doesn't remain much for the language though. But a metacircular Scheme is tiny, too.

And linking, too! I remember a paper where they expressed the places in the object code where replacements had to be made as sexprs. Would be funny to imagine what could be achieved with programmed trampolines etc.

A layman's proposal

I agree with the above posters' statements about having a common itch to scratch. As a layman who loves trying to follow y'all rocket-scientists' conversations and rarely understands half of it, here's what I'd really like to see:

1. Haskell-like syntax -- it's easier to visually parse & understand code that uses guards, pattern matching, &c. than to reason out recursion (which is fine & necessary, but slower to scan when you're reading through a 10kloc server)
2. Mixed strict or lazy semantics controlled at the function-declaration level
3. Massive lightweight multithreading
4. Toolchain-neutral compiler implementation (not just ported gcc)
5. Mixed functional/imperative
6. Record system with object identity
7. Class system like Haskell's
8. Monads & Arrows with syntactic sugar
9. SWIG binding
10. Implicit compile-time generics (at least as good as C++)
11. Standard GUI, even if that means a few extra keywords or constructs
12. Built-in debugging support based on compile switch, based not on the file&line, but file, line1/position->line2/position
13. Ability to use inline C or assembler instead of a FFI, including calling-convention control

In general, the idea is to take all this amazing stuff you guys have access to, and put it in a language and environment designed for writing the type of big, dull, boring programs people get paid for.

A customer for you :-)

From BooManifesto:
A wrist friendly syntax, expressiveness and extensibility. That's not all.
I want my code to play nice with modules written in other languages.
I want a rich programming environment with a well thought out class library.
I want to be able to run my programs in multiple platforms.
Oh, sorry:
I want the CLI.
Oops...

But seriously, there is a long list of features in the manifesto - the problem is, Boo already implements them :-).

Bottom-up Design

Given specification item #2, start from Typed Assembly Language or something like it. Many language efforts have this itch to be "as fast as C" in the tired cliche, but leave that study to last (specification item #1 being explicit on the point). If you want a language that is hardware-oriented, then start there from the beginning.

Saffyre... my take

here goes.. My take: ... I doubt its "Leanness & Meanness" but minimalism isn't everything.. right? in fact its probably not so good in many other ways but that's up to you guys to judge (I'm a novice, thus expecting criticism to be harsh):

As I see it, people prefer their own way of wording constructs: keywords like a class Property are pretty much the same in meaning as Attribute, yet Predicate - a probably lesser-known kind of function (with a boolean return type) - means something slightly different from Function or, if you like, Procedure... Please bear with my keyword synonyms; I seem to have an affinity for the Pascal family (such as Modula, Pascal, Oberon, Delphi) of languages. After all, it was my first language to use, and this is my first to create...
Based on (Component) Pascal, with a touch of of inspiration from Python (and even less from Haskell) I offer... Saffyre:

Note: If you haven't already noticed, the name Saffyre itself is a pun on the gemstone theme some other languages seem to have followed, such as Perl and Ruby, so if Saffyre ever turns out to be practical I'm fully expecting someone to come up with a language (called Emryld or something) to show it up :) ...

here's my notes:

Keywords/Structure:

1. Program/Module/Library
- Base Level container (Program = auto Module/Namespace, FUNCTION <Program Name>/main)

2. Begin/Body <Namespace> ... End <Namespace>
- The Namespace bit is optional. For blockable statements (i.e. If, For, etc.) the Begin is assumed but can be stated explicitly, i.e. Begin For (unless I decide instead to use C-style first-line-only unless explicitly declared as a block - or at the very least to explicitly declare what is ending, i.e. EndIf;). Unlike Visual Basic, no line continuation character is required; statements end with a Pascal/C-like ;

3. Class/Unit
- outside main or if NameSpace main == Program. The Module keyword creates a Static Class
- Inherits <SuperClass>; - multi Inheritance
- Constructors, Destructors
(via New Function/Method, and Destroy Method - for .NET = Dispose)

4. Interface/Include/Uses
- Importing/Exporting Superclass vars/methods - basically == prototyping - [As newname] optional

5. Implementation/Packages
- Package/Task/Event - inc. Overloads and Overrides/Redefine, Can use for Functions/Operators - optionally Shadows for hiding Inheritance

6. Property/Attribute

7. Type/TypeDef <TypeName> Is [Packed] Object/Record/Structure/File/(Array[<range>] of <Type>) =/:= <typedefinition>
- inc. Variant: [& Case] ... End; and Subtype <Name> is <Range/Type Range>;
- Arrays can be Multi-Dimensional

8. Const/Var/Declare
- Creates a New Constant or Variable/Member may be initialized by Assignment (:=) and allows use of New (basically constructor call) for Instantiation of Objects

9. Function/Method/Predicate/Procedure;
- Unlike Pascal, Functions don't *require* return values but are encouraged to have them via good programming practices; also Predicates = Boolean return type - which doesn't need to be specified - and the Return keyword is equivalent to assignment to <FunctionName>
- Methods are intended to be Class-Member Functions (as opposed to Member Variables) and them having a return type is optional
- Forward/Static/Virtual/Abstract/Limited/External Function Types where Forward = Prototype/Interface, Abstract = SuperClass that Requires Inheritance, External = Defined in other Package/Unit
- Public/Private/Protected/Friend Accessors
- Operator Overloading via [accessor] Function Operator<Symbol>(<operands as parameters>) ... End; syntax - considering making the keyword Operator Optional if practical

Record/Structure <StructName> ... End;
- same as defining the Type/TypeDef - much the same way as defining a Class vs. an Object, either inside or outside of a TypeDef statement; personally I prefer both being Record/Structure definitions of their parent rather than a messy Type statement, simply because they include clauses of their own... (hmmm, the optional Packed keyword goes before the Record/Structure statement I guess, just noticed that anomaly)

Raise <Exception/Event Name> ... [Try ...] Catch ... [Else ...;] [Finally ...;] End;
- Although separate I included a Try Statement here...

Requires <Condition>/Ensures <Condition>
- for Programming by Contract

With <Object/Control> [Do] ... End;

If <Condition> [Then] <Statements> [Else ... ;] [Elseif ... ;] End;
- Can use When instead of If for elseless if, or Unless for Thenless If.

Case <Var> [of] <Case> : <Statements>; [ELSE ... ;] End;

For i := <Number> To/DownTo <Number> (or) For Each <i> In [Reverse] <Data> [Step/By <Number>] Do/Loop ... Next [i]/End;

While <Condition> [Do] ... WEnd/End;
- Ok so WEnd was "borrowed" from Basic, but I like it ok plus if you don't you don't have to use it End works just as well! :)

Repeat/Loop ... Until <Condition>;
- self-explanatory

Language <LanguageName>
- For Embedding Inline XML, SQL, C/C++/CS, ASM & other Scripts, Development Environment or in Windows can be any ActiveX Language. Note: Inline XML too, the compiler parses the < symbol when used to begin a "statement" and recognises them as XML because it is outside an expression using it to span to the next > or / a good editor can even match start tags with end tags much like bracket matching... which means XML attributes and values can be used to document (or otherwise tag) code the Language Keyword itself is optional for SQL or ASM, inline XML actually requires its absence for reasons above.. but for anything else, the Language Keyword is required...

Label/Goto <label>

in/out/ref/var
- For Refer to or Pass By Value/Reference

Continue;/Break;/Pass;
- For Flow Control

Self/Super (References), (Sub for Abstract-only?)

Task/Event Keyword for Events(/Triggers?) and Atomic Transactions (& Packages?): Entry, Declare Begin [Select ... Or ... Else] End;
- Commit/Rollback
- Threading Support?
- Also allows Requires/Ensures Clauses

@ operator for Indexing/Dictionary Key
- as alternative to [Index] for Arrays, Queues, Lists and Stacks?, etc or for overloading as custom feature for functions - good programming practices would suggest attempting to keep the meaning of the term 'at' for the symbol when doing so if possible

.. operator for Ranges (useful in Arrays and Subtypes)

Enumeration (Enum Keyword)

Normal Operators: + - * / ^ % & ++ --
Logical Operators: = < > <= >= <> !
Assignment: := += -= *= /= ^= &=
Note: includes handy C-Style Pre/Post Increments/Decrements

Basic Types: Integer, Real/Float, Complex, Character, String, Boolean/Bit
Additional Types:
- Number - tries to simplify numerics and have compiler handle the actual number data type for the user - basically a polymorphic number (Assumed Integer?) that is Type converted as necessary as I suspect the Euphoria Language uses - also all Integer-types have Suffixes for Binary, Octal, Hex, etc such as # and &
- Bit, Nibble, Byte, Word, DWord - for Binary where Bit = Boolean, Bytes are 8 bits, and so on up the scale...
- Register
- Note: Complex Numbers ie, 6 + 5j possibly as a set (Saffyres tuples? - see below)

Unmanaged... + .NET integration
Windows Forms/WebForms
<XML>/SOAP/ASP.NET/Saffscript/SSP (Saffscript Server Pages)
DBConnection, DBCommand, DBReader/Writer
(SQL,OLEDB, etc.)

Sets (Tuples?) using {<list>} syntax
- with Built in: Slicing/Swapping
- NOTE: This kills Pascal Comments, so Line Comments are C-Style // ones, am considering {{ and }} for Comment blocks, if not possibly \/

other Brackets: [] is mostly for Arrays... any other applications if any are assumed to work the same as Pascal does, ()'s are for Argument lists, Precedence/Math brackets, etc... also the same as Pascal, and aren't required for parameterless Function Call.

Literals: Chars 'c' vs Strings 'strings' vs "strings"

Constants: NULL, Pi, e, (whitespace keys), etc.

Libs & Extras:
sorted lists, dictionaries, stack/heap (filo), queue (fifo), etc
Console IO
WinAPI Lib and COM component References
SGML(Mostly XML, includes [S/D/X]HTML, XSL, XSD) & SOAP
Network/Internet Protocols (from TCP/UDP and IPX/ODI to HTTP and IRC)

Funcs:
Math - [a]sin/cos/tan/sec/csc/cot[h], and an sqrt that returns +/- Tuple (for Relations)

Type Casting - Probably done by a Function ala VB's CType FunctionSet.. at least for those already bound...

I may not have mentioned accessors earlier; if not, the usual Public, Private, Protected and Friend keywords should suffice... it's also worth noting that the default is the parent's (or, if unspecified, Public?)... and if you wish to block them together, using a Begin - End; block makes sense here

Considerations I know less about (some may not even Apply):
Keyword Synonyms - Good thing or bad?
Flexible End; Syntax - ie End; or EndIf; End If; allows Explicit or Implicit End Blocks, of course Good programming practices would probably lean towards Explicit ones.. also as if Function above doesn't have enough Synonyms, considering shorter Function keyword, perhaps Define (so Overriding can be Redefine thus isn't quite as confusable with Overloading) - and possibly removing both the Function and Procedure keywords completely.
Pointers? - Referencing and Dereferencing
Monads and Like Comparison Operator ~=?
Lambda Expressions?
Assignment within Conditions?
Generators and Optional Functional Programming?
Delegates?
Generics?
A Swap Operator like Python and Ruby have; of course a Swap function could do the job... but it's nicer to have an operator. It's easy enough to overload an operator, but I hadn't considered till now creating one not specified in the compiler!
Sprite Toolkit/Library (ties in with IDE Bitmap/Icon Editor).
Possible -optional- Begin/End Symbology rather than Keywords, ie, ==> <Code Block> <==
Concurrent For Loops also additional/alternative C-Like Syntax For/ForEach Loops of course using a different keyword?
Code Regions, as XML tag? if not as a "Compiler" Directive or Both
Redirection Operators - <<, >>
Whilst '&' can be an AND operator (or possibly String concatenation) considering ? as an OR operator, other Boolean operators, for XOR, XNOR/EQV, COMP, etc.. (! is already not ... <> is default is NOT Equal, but probably != is an allowed synonym) also uncertain if to commit to a C-like Block = first statement only unless Began/Ended, or requiring End keyword like VB, because Infix is the default notation (Prefix for Unary operations)
I was considering if there was a way to optionally support prefix/postfix notation as well.
Escape Sequences in Strings and String Formatting placeholders for Variables, Personally I prefer the Pascal way, Writeline('This is string 1: ', strString1, 'and this is string 2', strString2); but that doesn't mean a format for variable placeholders and also for escape sequences is unviable... also before I forget to mention it, Null keyword = C's void or vb's Nothing.
Strong-Type, Weak-Type, Duck-Type?
Regular Expressions?
Mixins?
And Finally - Last one... Considering switching the meaning of ; and , end a Line or Clause with a , and separate parameters, lists etc. with ;??? - just seems to fit the Program's End. vs other block End;'s better if its an End, but then my punctuation skills kinda suck anyhow... or perhaps force the compiler to ignore the ',' character as a Statement terminator when used within a span? ie (var1, var2, expression1, expression2) and other spans [] {} "" etc.

Note: I figure if someone is artistic enough the inline XML could be handy for a number of purposes, from Documentation to a File Format Descriptor and of course as mentioned above Code Folding

Note: Ideally an IDE would have the ability to incorporate UML diagrams and a visual flow layout, something like Visula and/or Sanscript (at least for objects), plus dataflow/chart/pictorial representations for flow control, databases and other data structures, documents and other structures, including Class Diagrams, DataFlow Diagrams, State-Action Diagrams, Database Entity-Relationship Charts, Flowcharts, Hierarchy Structure Charts, Use Cases, Organizational Charts, etc... Many can be extrapolated from code, or, if the chart/diagram is created first, a code skeleton (or more) can be extrapolated from it. Of course, also handy would be a simple image editor for icons, bitmaps, sprites, JPGs, etc... But that's a pretty tough ask.

An alternate, simple macro-like interface for beginners: think MS Access macros, a list of basic and library functions, as in the Visual Studio/Delphi (.NET) toolbox, that can be dragged and dropped into a sequence of commands to run, with a step sequence and the action, parameters/properties and quantity (for repetition). I'm not sure how to handle flow control for that kind of thing, though - I actually found a programming application once that implemented something like this nicely, but I haven't got it anymore, plus it was a trial version - VistaTask I think it's called... anything to be user-friendly. (It can also be used merely as a code generation tool.)

Of course Another useful tool imho for an IDE is communication to forums and/or IM capability (whether it be IRC, Jabber, ICQ, YAHOO, MSN, Odigo, etc... or multiple)

By Forge (forgeaus@hotmail.com)

Suggestion

I like the idea of designing a language at LtU. I've considered the very idea myself, like probably many of us have. I'd like to suggest starting with a collection of favourable language constructs. So instead of deciding beforehand on technicalities like type systems, higher-order functions, module systems, purity, etc., let's start from the language constructs we would like to have in order to solve practical programming problems. What would typical utterances we'd like to make look like, and what would be favourable semantics for such utterances? I'm interested in what an ideal general-purpose programming language would look like and what its semantics would be, regardless of any execution engine. After having designed such an ideal language, we could try to map it to some intermediate compiler language and figure out which language constructs are reasonable and which are not. Then iterate until practical.

The Spirit parser framework - metaprogramming and no macros

On the topic of metaprogramming and macros - there is one excellent example of metaprogramming with minimal macros - the Spirit parser framework (part of the Boost C++ libraries).

The developer who created Spirit is very much against macros (he says so very forcefully in the docs), and I agree with him.

Anyway - in response to the "let's make a programming language" challenge, rather than suggesting a completely new language, I'm keen to see Orca (the open-source Rebol clone) developed. At present, it's still pretty basic, but it has great potential (given that Syllable will now use it as their default scripting language).

Of all programming languages, only Forth comes close to the "power-to-weight" ratio that Rebol has. Python, Ruby, Perl - all good, but pretty heavyweight. One of Rebol's best aspects is that DSLs are trivial to create, using "dialects". So, there's the "create your own language" thing taken care of ... :-)

Isn't Spirit written with

Isn't Spirit written with template metaprogramming techniques? These are C++'s ad hoc version of macros. We aren't talking about the C preprocessor here. Or do you mean the author limited his use of template metaprogramming?

Some words of advice on language design

Before you go off inventing new programming languages, please ask yourself these questions:

  1. What problem does this language solve? How can I make it precise?
  2. How can I show it solves this problem? How can I make it precise?
  3. Is there another solution? Do other languages solve this problem? How? What are the advantages of my solution? of their solution? What are the disadvantages of my solution? of their solution?
  4. How can I show that my solution cannot be expressed in some other language? That is, what is the unique property of my language which is lacking in others which enables a solution?
  5. What parts of my language are essential to that unique property?

If your answer to 1 is "it's cleaner", go home. If your answer is "it has this very small core which everything is definable from," nobody cares. (Well, I might care, but you will never convince me it's interesting without some mathematics. There is already a one combinator basis for untyped computation. SKI was known decades ago. For typed languages, it's more complex but also mostly pointless.)

If your answer to 2 is, "I will write programs in it after I have a prototype," then you have not thought very carefully about it; also, your feedback cycle is too long. If your answer involves more than one buzzword, you are kidding yourself.

If your answer to 3 is, "I don't know," then you don't know enough. There is always more than one solution. (Trust me; I never write, "the solution to this problem is..." anymore, and when I read it and I don't see a proof of uniqueness, the writer invariably turns out to be full of it.) If your answer involves only languages of one paradigm, likewise. Go study Scheme and Prolog and ML and Haskell and Charity and Lucid Synchrone and OBJ and Erlang and Smalltalk. Look at Epigram or Coq or HOL or LEGO or Nuprl. Aside from Java, these are the important ones. If you are familiar with all of these, then you are in a decent position. If you have only ever programmed in C/C++/Java and Lisp and scripting languages, you have been sitting in a corner your whole life. Perl, Python, Ruby, PHP, Tcl and Lisp are all the same language. (Scheme itself is only interesting for hygienic macros and continuations.)

If you don't have an answer to 4, then your solution belongs in a library, not a language. In fact, the people on LtU are the perfect people to design good libraries, and libraries, on the average, are much more valuable than languages. Library design is also easier, and you won't waste as much time on syntax. (You will waste time on syntax. You will waste almost all your time on syntax.)

If your answer to 5 is "only this and this", rip those out of your language and add them to an existing language. ("Refactor mercilessly.") If your answer is, "almost everything contributes," I can almost guarantee you you are wrong. (If you aren't, then you are probably a researcher.)

The reason most new languages are pointless is that people rarely answer these questions honestly. That is why language design is hard, and why researchers hardly ever make new languages.

Oh, and, I know this will fall on deaf ears but: don't indulge in syntax design. Pick some other language's syntax style. Java, Lisp, Python, Haskell, it doesn't matter. Just get it out of the way and close the matter immediately. If your language's original contribution is syntactic, you are hopeless.

Think about language features in terms of asymptotic complexity — not of space or time, but of, well, complexity. (I would say semantics, but that's a dirty word.) Syntax changes can only reduce that complexity by a constant factor. A good language feature changes complexity from n-squared to n or to n log n. It increases modularity by localizing something which was global. The best features do so without compromising any other desirable properties (such as type safety).

I would also add that there are more opportunities to innovate in typed languages than untyped, and concurrent languages than sequential. (Personally, I think the only interesting thing in sequential, untyped languages would be something involving delimited continuations or narrowing or compilation.)

Here endeth the lesson...

Wow

You completely rule.

Arrow

I've seen you mention your programming language before, but never any real details on it. What problem does it solve? Is it mostly type system related?

Re: Arrow

I've seen you mention your programming language before, but never any real details on it.

Yes, mostly because I cannot honestly answer all the questions I put forward above. I will do my best, though. (I can't give full answers here anyway; it would take too much time.)

What problem does this language solve? How can I make it precise?

The most important problem I'm trying to address with Arrow is the fact that ML-like algebraic datatypes are not abstract or compositional enough to support interoperability on a large scale.

For example, I can define a datatype of naturals in a number of different ways. One is data Nat = Z | S Nat. Another is data Nat' = Z' | S' Nat'. These differ only in naming but obviously have the same models. Another is data Nat'' = Z'' () | S'' Nat''. This one just has an extra unit. Another is given by Succ Nat where data Succ a = Fresh | Old a. Here I have factored the datatype into two; but when I compose the factors, the result is only isomorphic to Nat, not equal. By applying isomorphisms in this fashion I can obviously come up with an infinite number of ways to represent the naturals, all unequal, and therefore not interoperable.
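
As compilable Haskell, with one of the coercions spelled out, the kind the proposed type system would be expected to derive for itself (a sketch):

data Nat    = Z | S Nat
data Nat'   = Z' | S' Nat'
data Nat''  = Z'' () | S'' Nat''
data Succ a = Fresh | Old a

-- The canonical coercion between Nat and Nat'': unique, and
-- mechanically derivable from the structure of the two types.
toNat'' :: Nat -> Nat''
toNat'' Z     = Z'' ()
toNat'' (S n) = S'' (toNat'' n)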

But in fact I've only just started. The representations above were all numerals in unary base, but that's very inefficient. Better is something like [Bool], which is a representation in base two. And of course I can do all the things to this that I did above. But why stop there? How about base ten, [Ten]? Or [Either Five Five]? Or all compositions using functors only isomorphic to []? Other factorizations? And so on...

Here is one reason this is problematic: a realistic program is one that uses libraries from many disparate sources. For them to interoperate, they need to use the same representations. But libraries developed in isolation are not likely to do so. At the very least, people will choose different names.

The way I am trying to address this is to make the type system know when two types are canonically isomorphic -- this means roughly that it knows when two types are mutually coercible in a unique way (no guessing -- there is exactly one coercion, or none). Then I add constructs which let you define "generic" functions that don't care, say, if they get as input a [Bool] or a Nat, only that they fall in the equivalence class (what I call a "genus").

More precisely, the way I do it is by identifying a basis of types from which all others are generable by canonical isomorphism. These are "type values"; all the rest are "type redexes". If two types have the same type value, then they are intercoercible. This is enough to generate a coercion between them, but using that coercion costs you run-time, and I think you can do more.

What I want to do is classify types according to kinds which denote different bases. Then there will be a functor from one kind to another which corresponds to change-of-base as in linear algebra. Natural transformations for such functors should be the generic functions I am looking for, and I think that the change-of-base can all be done statically, essentially by normalizing a program's types at compile-time. It is a bit like partial evaluation, but without the binding-time analysis.

Mind you, the problem I described above is only one application of this idea. I could give you four or five others.

How can I show it solves this problem? How can I make it precise?

Formal methods.

Is there another solution? Do other languages solve this problem? How? What are the advantages of my solution? of their solution? What are the disadvantages of my solution? of their solution?

Yes, there are other solutions, if you compromise on one point or another. The most obvious thing is to promulgate standards, but standards require cooperation, and the problem I am trying to solve is one where parties can't cooperate because they don't know of each other. Also, there is the adage that there are so many standards to choose from; given two choices of n standards, you have n-squared combinations, so this doesn't scale.

You can go in the direction of dependent types and strengthen type equality. But this is a variation of the standards argument. Dependent types exploit type equality, and, for example, cartesian products are generally not associative up to equality, yet they are absolutely fundamental. As you force more and more types to be equal by adding conditions, you get things which look less and less like sets, and type checking gets harder and harder. One advantage of my approach is (I conjecture) that it supports principal type inference.

There are module systems like ML. These are variations on dependent types. Module systems have exactly the problem I am solving, namely that when you apply an opaque functor, you lose the representation type. ML solves this with type sharing, which is a type equality assertion. Sharing assertions are tedious to write, and break abstraction: they force representations to be equal, which is in practice often not what you want. The problem is that signatures only tell you types, not laws. Also, most module systems are limited by the fact that you can only instantiate a finite number of functors in a program. And there is polymorphic recursion...

My approach has disadvantages. First, I only know at present how to handle very weak sorts of isos, although I think I know how I can handle more. Second, all the information is structural, so you have to program "typefully". You will never get the compiler to infer, for example, that file handles and files are one-to-one. Third, my approach seems to require bringing in a lot of other apparently unrelated stuff like linearity, and it is very abstract (though I hope that it will seem obvious once you see some examples).

How can I show that my solution cannot be expressed in some other language?

Again, formal methods. Showing there exists no translation, etc.

That is, what is the unique property of my language which is lacking in others which enables a solution?

There are several interrelated things. First, the type system needs to know what is a type value and what is not and how are they related. Second, it needs to understand the difference between a type function and a functor/monad. (Haskell datatype declarations don't give enough information for this; and Haskell type class notions of Functor and Monad are not guaranteed to be actual functors or monads.) Third, it needs a notion of datakinds, functors which are not endofunctors and their associated polymorphic functions. I have heard Omega has these things, but I think they are built on equality, and I have not had time to check it out.

What parts of my language are essential to that unique property?

I cannot say yet, as I have not formulated a solution.

There... not too dishonest...

(Please note, by the way, that untyped languages suffer from exactly the same problem but have no hope of ever using such a solution, since the whole thing is enabled by static typing. This is why I always say that typing increases freedom -- typing means you don't have to depend on other people's cooperation. It does not matter how they represent things, as long as you get a static description of it.)

Update: I want to give you another application because, although it is trivial, it suggests the possibilities that open up with iso-equivalence. People often complain that Haskell does not support extensible sums or extensible products. But, it does, of course, in the sense that you can always add constructors to a datatype T by defining them in S and using Either T S as your extended type. Or you can leave a "hole" in your type by adding a type parameter; so you have a datatype T a and extend it with S a getting T (S a) which can be extended again. (This is how objects are encoded sometimes.) But no one sees this approach as a solution, because it is a real pain to dig past the extra constructor to get to your extended ones. If data T a = C1 Int | C2 a and data S a = C3 Char | C4 a, they don't want to pattern-match on C2 every time they need to get at C3; they want data T' = C1 Int | C3 Char. But this T' is iso to T (S ()), so generic functions in my sense treat them the same, and you get mostly what you want, without compromising type inference.
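
Spelled out in Haskell (a sketch of the encoding described above):

data T a = C1 Int | C2 a
data S a = C3 Char | C4 a

type T' = T (S ())  -- the "extended" type

-- The pain point: digging past C2 by hand to reach C3...
getC3 :: T' -> Maybe Char
getC3 (C2 (C3 c)) = Just c
getC3 _           = Nothing

-- ...which is exactly the boilerplate that iso-aware generic
-- functions would let you avoid.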

Unique, Rare Coercions

know when two types are canonically isomorphic -- this means roughly that it knows when two types are mutually coercible in a unique way (no guessing -- there is exactly one coercion, or none).

Perhaps not the best place (or time) to mention it, but unique coercions don't exist for the vast majority of types. Given [Bool], for example, an integer can be represented big-endian, little-endian, various arbitrary patterns of bit orders, etc. You can't find a unique coercion even between (zero | one) vs. (false | true).
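
For instance, both of the following are perfectly good isomorphisms, and nothing structural singles one out (a sketch):

data Bit = Zero | One

bitToBool, bitToBool' :: Bit -> Bool
bitToBool  Zero = False   -- the "intended" reading
bitToBool  One  = True
bitToBool' Zero = True    -- just as canonical, structurally
bitToBool' One  = False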

Your goal looks useful. Now, two years later, do you still have reason to believe your approach is taking you there?

Frank's underspecified problem

What canonical isomorphisms there are depends upon what morphisms his category of datatypes has. Given enough morphisms, two unparameterised datatypes are isomorphic iff they have the same cardinality, and so only pairs of datatypes of cardinalities 0 and 1 have unique coercions. Note (i) that Frank says he can only handle "very weak sorts of isos" and that (ii) there is no unique datatype representing integer.

And observe that Frank hasn't been active here recently, so don't expect him to reply.

If syntax is unimportant...

then how come it so frequently leads to religious wars between language lovers?

Oh, and, I know this will fall on deaf ears but: don't indulge in syntax design. Pick some other language's syntax style. Java, Lisp, Python, Haskell, it doesn't matter. Just get it out of the way and close the matter immediately. If your language's original contribution is syntactic, you are hopeless.

A semantics has to be mapped to a syntax in order to be communicated, and is worthless by itself. Humans spend so many years assigning (more or less fuzzy) meaning to words, sentence structures and simple mathematical expressions (arithmetic) that choosing an arbitrary syntax which ignores or violates this common body of knowledge seems to me really wasteful of the effort spent designing clean semantics.

I think that both clean semantics and a good syntax are necessary. In my book, a clean semantics is one that is expressed with mathematical rigor, and a good syntax is one that:

  • is as close to the semantics as possible
  • departs from previously used syntax only when needed to get closer to the semantics

And by "previously used syntax" I don't refer to the many accidental syntaxes invented in this or that programming language, but to what everyone learns at primary or high shool.

I think that Haskell and Pico are good examples of carefully chosen syntaxes (as opposed to Scheme, which focused almost exclusively on semantics) and that their authors do not adhere to the "syntax is unimportant" school of thought, though they cannot be suspected of ignoring the importance of clean semantics. See Pico: Scheme for Mere Mortals:

the expressiveness and elegance of the language semantics should not trivialize the external syntax to the point where it becomes unattractive. On the other hand, syntactic constructs should not invalidate the semantic power of the language.

Even among mathematicians, some, like Eric Hehner do not treat syntax disrespectfully. See A Practical Theory of Programming, p.203:

Whenever I had to choose between a standard notation that will do and a new notation that's perfect, I chose the standard notation.

Many mathematicians consider that curly brackets and commas are just syntax, and syntax is annoying and unimportant, though necessary. I have treated them as operators, with algebraic properties (in Section 2.1 on Set Theory, we see that curly brackets have an inverse). This continues a very long, historical trend. For example, = was at first just a syntax for the statement that two things are (in some way) the same, but now it is an operator with algebraic properties.

Indeed, his work shows that it is possible to have clean semantics and a syntax which is as close as possible to those semantics without contradicting the knowledge and intuitions that people spend so much time learning at primary and high school.

Down with (= (+ 1 1) 2) !

Concrete or abstract syntax?

I just skimmed through the Pico PDF you linked, and the first thing that I noticed is how unintuitive ":" and "::" seem to me - I would have switched their meanings, for two reasons: first, to encourage immutable names by giving them the shorter syntax, and second, to parallel the use of "=" for a referentially transparent operator and ":=" for a destructive operator.
This is not to diminish the achievements of the authors, but just to show how difficult it is to please everybody with some concrete syntax.

My personal belief is that good semantics is the sine qua non, while reasonable syntax is very important but secondary to semantics (so probably I agree more with you than with Frank).
OTOH, the importance of syntax heavily depends on the target audience - for research PLs, concrete syntax is not even required (though abstract syntax obviously cannot be dropped).
I therefore assume Frank meant concrete syntax in his lecture.

Maybe it's possible to get rid of this distinction...

without losing readability. IMO, LISP/Scheme managed to make concrete syntax equal abstract syntax, but lost readability by mere mortals. Pico is an attempt not to lose this important feature. I personally am playing with ideas for mapping clean semantics to sentences as we all know them, using parentheses only where necessary to disambiguate things.

[...] how unintuitive ":" and "::" seem to me - I would have switched their meaning for two reasons - first, to encourage immutable names by having a shorter syntax for them, and second, to parallel the use of "=" for referentially transparent operator and ":=" for a destructive operator.

Agreed. Actually, I think a programming language targeted at the masses should refrain from introducing new symbols, and prefer words for semantics that do not map to some existing symbol (for instance, "set" for assignment).

Regarding "research PLs", I understand the need to not "waste" time on syntax but one as to know that in the end, it will restrict the dissemination of the underlying ideas to the narrow audience of people who have time to spend learning the ad-hoc notations.

Wadler's Law

If syntax is unimportant...

then how come it so frequently leads to religious wars between language lovers?

Wadler's Law

Interior Decoration

Language discussion often reminds me of "Trading Spaces" or "Flip This House".

Which is probably just a lower bound!

My point is not to call for wars on syntax matters. That would be a total waste of time in the absence of objective measures of a syntax's "quality" or "relevance".

My point is that syntax is important. Just that. You won't explain anything new to someone until you find the words that "ring a bell" with him. You can use as many λ's as you want in a paper targeted at PL researchers, but you'll have to find something else if you want your ideas to be understood by a wider audience. And disseminating ideas is the point of research in the first place. Or is it not?

My interpretation of Wadler's law is just that syntax is not a topic to be designed by committee.

Wadler's law

Wadler's Law is a version of Parkinson's Third (I think) Law.

* THE LAW OF TRIVIALITY: The time spent on any item of a committee's agenda will be in inverse proportion to its importance.

Or, as the old joke has it:

Because the stakes are so low.

WW

If syntax is unimportant

If syntax is unimportant, then how come it so frequently leads to religious wars between language lovers?

So you are saying that if you love something, you are automatically an expert in it?

Do you want to know the truth? I don't think there are very many people who "love" languages. Most of the people who claim to be PL aficionados are not PL aficionados but syntax aficionados. The part that is not syntax flies right by them. The syntax which is supposed to make the semantics more transparent instead makes it more opaque. Not because it's a "bad" syntax, but because it's a "good" syntax!

The problem is that "bad" and "good" mean different things to the two camps. For the people who value semantics, a syntax is "good" if, as I said (and you also suggest), it makes the semantics clearer. But to the people who value syntax, a syntax is "good" not by virtue of its relationship to semantics, but rather by its novelty or, conversely, by its familiarity, or "cleverness". For them, syntax is an end in itself.

I strongly suspect, furthermore, that the people in the latter camp are the same people who think that PLs are all the same because they are all Turing-equivalent. They think, therefore, that whatever differences there are among them must be syntactic. They don't understand that two things may be F-equivalent but G-inequivalent, for whatever notions F and G you care to pick, or that equivalences come in various degrees of fineness.

I mostly agree with your post, and indeed I know and respect several researchers who were part of the "Squiggol school", which emphasized choosing notations carefully so as to facilitate formal manipulation. However, on the whole I have to say that I think the design space for syntaxes is much smaller than that for semantics; so much smaller that, next to semantics, it is insignificant. When it comes to text, you have only so many symbols and so many ways of putting this here or there. And when it comes to PLs, as I've argued before, you are fundamentally limited by the fact that languages must be parseable, whereas semantics can be Turing-complete.

And as for Scheme versus Haskell, syntax-wise I prefer Haskell, but I don't really care much.

Not even remotely close to an argument

If syntax is unimportant, then how come it so frequently leads to religious wars between language lovers?

So you are saying that if you love something, you are automatically an expert in it?

It was just meant as a thought-provoking sentence, not anything even remotely close to an argument. Thanks for the opportunity to clarify that.

Use cases

I think you are a little unjust to "syntax aficionados", and you tend to ridicule programmers. (There is a latent aggression in your postings, which are otherwise very interesting. I don't know why. Saviour-of-mankind complex? Slashdot damage? You have a comfortable position at a university, haven't you?)

A reasonable approach for any kind of program development, including programming languages, is to start with use cases: what should be expressed, and how should it be used -- always including feedback from previous experience with other languages, of course. Excluding the user can be done later, at any stage where one formalizes or hacks around conceptual/technical problems that mostly involve the compiler machinery, i.e. that affect usability but are hidden from immediate use. I do not consider designing a language much different from, say, designing forum software. There is always a subtle interplay between features and usability, and knowing/evaluating a language as a user rather than as a researcher is quite legitimate. Some languages are radically innovative feature-wise, while others smooth out usability. I guess this won't go away, independent of the scientific progress you and other researchers make.

PS. Best wishes with Arrow. The type normalization you seem to be going for reminds me of simplification in computer algebra systems -- at least from a distance, and with my limited understanding of type theory so far. I'm not sure whether those problems are real, though, or whether ML programmers are demanding this.

If this community were homogeneous enough...

If this community were homogeneous enough to design a programming language by consensus, we'd be too boring to talk about programming languages with.

Once static vs. dynamic typing comes up...

...then that's it for collaboration :)

Personally, I think the key feature a new language needs is implementability. The tunes.org people, for example, went off the deep end for years with nothing to show for it.

Maybe we should make two languages...

One static and one dynamic. But then we should also have functional, procedural, declarative and OO languages, for those who can't agree on the overall structure. We would need to further divide OO into several kinds: class-based, classless or prototype-based, and multiple-dispatch/CLOS-like systems. Functional would need to be divided into at least pure and impure. In the end we'll hopefully have divided the problem space sufficiently that everyone can find a suitable language to work on. The only problem may be that we'll end up with nearly as many languages as LtU members ;-).

Language System

I know we're not really supposed to hash out ideas here, but I'm mentioning these both as features of a language system and as a question of whether they're possible. So I don't know whether this falls in the realm of questions or ideas.

Looks like this thread has kinda died, but I saw a few of you mention making a language system - by that, do you mean a program which could utilize any language? If so, I had a few ideas as to what one could do with that. Using a base language like Perl, you could make a parser to handle the processing of multiple languages.

Make libraries nameable and importable into a database of some sort for the language system itself, and just have the parser search for some kind of language tag. It would allow different languages to be used within the base language easily... You could use it for cross-platform programming, and the like... You could make the parser pick up variables in another language's format and store them in a variable of the same name in the base language...
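If I understand the idea, the core of it could look something like this rough OCaml sketch, where a table maps a language tag to an interpreter for that block's source. The tag names and the two toy "interpreters" are invented for illustration:

(* Toy "language system": each embedded block carries a language tag,
   and the driver routes the block's source to whichever interpreter
   is registered for that tag. *)
let interpreters : (string * (string -> unit)) list =
  [ ("echo",  fun src -> print_string src);
    ("shout", fun src -> print_string (String.uppercase_ascii src)) ]

let run_block tag src =
  match List.assoc_opt tag interpreters with
  | Some interp -> interp src
  | None -> prerr_endline ("no interpreter registered for: " ^ tag)

let () =
  run_block "echo" "hello from one embedded language\n";
  run_block "shout" "hello from another\n"

The hard part, of course, is not the dispatch but sharing variables and data across the embedded languages, as suggested above.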

I also had some ideas on a construct called an order, though I don't know exactly what problems it would solve. You'd have the option of tagging code with an order level and a name tied to that specific order... sort of modularized, multi-language, abstracted code. Code processing would start with the highest order and work its way down... and you could have order tags within orders, as long as they weren't at the same order level... After everything had been processed down to order 0, it would run through the code and process anything not within order tags... Of course, that would necessitate functions that could prevent the processing of a specific order-module, or an entire order, or several orders, etc.
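As best I can model it, the processing rule would be something like this toy OCaml sketch, where fragments carry an order level and are handled highest-first, finishing with the untagged order-0 code (the type and all the names are invented):

(* Each fragment carries an order level; processing visits the highest
   remaining order first and works down to the untagged order-0 code. *)
type fragment = { order : int; code : string }

let process frags =
  frags
  |> List.sort (fun a b -> compare b.order a.order)  (* highest first *)
  |> List.iter (fun f -> Printf.printf "order %d: %s\n" f.order f.code)

let () =
  process
    [ { order = 0; code = "plain, untagged code" };
      { order = 2; code = "macro-like rewriting step" };
      { order = 1; code = "code produced by order 2" } ]

In a real system each pass would rewrite the lower-order fragments rather than just print them, which is where this starts to look like staged metaprogramming.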

I figure that would be most useful for metaprogramming... Alongside another construct idea I had, I think it'd be able to do just about anything in that area... I originally conceived of it as exphp (I work on a game written in PHP, so it's the language I know best): an eval that directly alters the source. I know it's a security risk, but it's a valuable tool as well... You'd be able to dynamically construct variable names on the spot, instead of having to do it in a roundabout way...

Is what I'm thinking of even possible? Would it solve any problems? Would it take too much memory/time to process? If it's both possible and not unrealistically resource-intensive, I think it'd be a cool project to work on.

Re: Let's make a programming language!

"Since LtU members are so knowledgable on programming languages, why don't we design the (ultimate) programming language? let's all post our suggestions here to make a lean-and-mean programming language. It would be a experiment filled with fun. I can make the compiler, if a 'standard' comes out of this. I apologise if this has been proposed before (and obviously failed, since there is nothing out).

"My initial suggestions are two:

"1) put as less as possible into the compiler. Make the language so versatile that all high-level concepts can be easily done with the core constructs.

"2) the lowest level of the language should map directly to the hardware, i.e. must be some kind of assembly, in order to allow for very low-level programming. But the language should be versatile enough as to provide the means for doing layers upon layers of abstraction, so one chooses the appropriate level of abstraction for each particular type of application."

Well, at first glance, some kind of "happy marriage" of Forth and Scheme seems to be what you're looking for here. But I think every programmer who knows both Forth and some form of Lisp has come to the same conclusion, and there are actually some "functional Forths" out there.

So what I think is missing is a "bare-metal Scheme" -- a Scheme interpreter/compiler/assembler with some knowledge of concurrency. Maybe Gambit/Termite would be a good starting point for this.
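One small observation in support of the Forth/Scheme affinity: a prefix expression tree flattens directly into postfix stack code, so the "functional" front end and the "bare-metal" back end are one fold apart. A minimal OCaml sketch, with an invented expression type (nothing Gambit/Termite-specific):

(* The tree below is the prefix form of 1 + (2 * 3); compiling it
   flattens it into the Forth-style postfix line "1 2 3 * +". *)
type expr = Num of int | App of string * expr list

let rec compile = function
  | Num n -> [string_of_int n]
  | App (op, args) -> List.concat_map compile args @ [op]

let () =
  App ("+", [Num 1; App ("*", [Num 2; Num 3])])
  |> compile |> String.concat " " |> print_endline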

Let's build a better backend instead!

Syntax is much less interesting than solving the really hard problems of making an efficient optimizing compiler and runtime system. So mainly what you want is to decide on a set of features that require backend support, then build the backend. Once you have that, then you can have any syntax you want on top of it. Remember that parsing programming languages of interest is a solved problem ;-)
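To illustrate the division of labour: once the core is fixed, surface syntax is a thin skin over it. A toy OCaml sketch with an invented two-constructor IR and two interchangeable "syntaxes" (here just printers, since parsing is the solved direction):

(* One shared core; any number of surface notations on top of it. *)
type ir = Lit of int | Add of ir * ir

let rec to_sexp = function
  | Lit n -> string_of_int n
  | Add (a, b) -> "(+ " ^ to_sexp a ^ " " ^ to_sexp b ^ ")"

let rec to_infix = function
  | Lit n -> string_of_int n
  | Add (a, b) -> "(" ^ to_infix a ^ " + " ^ to_infix b ^ ")"

let () =
  let e = Add (Lit 1, Add (Lit 2, Lit 3)) in
  print_endline (to_sexp e);   (* prints: (+ 1 (+ 2 3)) *)
  print_endline (to_infix e)   (* prints: (1 + (2 + 3)) *)

All the genuinely hard work - optimization, the runtime system, the garbage collector - lives behind the ir type and is shared by every front end.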

OK.... but... I have somthing else in mind...

I want to build one too... but I'm not sure if it's what you want or not... I have started figuring out how I want it, but I need a compiler... and I was wondering if you wanted to help me with the compiler? If so, maybe you'd want to talk over IM?

Thanks! Trey.

Question

Can anyone tell me whether we are trying to make this object-oriented? If it is, we can definitely use classes and methods like Java and C#, but allowing objects to be exceptions to their classes. If you like my idea, please say so.

We can also make dynamic and static classes. The dynamic classes will hold the exceptions, and the static classes will take on the characteristics that precede them.

Here is a version i thought would be nice

(include(picture.jpg)1) //The parentheses that contain picture.jpg would really be angle brackets, like the ones that mark a tag in HTML. This is where you put all the references to the files that your program will need.//
//This is a note. The letter or number at the end tells it which class to give the file's characteristics to - in this case, class 1.//

method 1 {
    class 1 {
        function title() { //places the title in the title bar//
            disp("sjfhkal"); //disp means display//
            nl;   //new line//
            endl; //end line//
        }
        function rotate(picture.jpg)(x) {
            x = 90;
        }
    } //end class 1//
} //end method 1//
//You can put multiple classes in a method//
//Can someone help me embellish?//

Yuck! The title is "Let's

Yuck! The title is "Let's make a Programming Language", not "Let's make another Java/PHP". :)

If there's yet another imperative programming language, I hope they at least get rid of {} blocks and ; statement finalizers. Oh, and = for assignment! ;)

Few more things

You can write dynamic in angle brackets to make an object an exception to the class it is in.

"Needs more objects..."

So we've all agreed...we're writing a prototype-based applicative systems programming language with syntax derived from Lisp, APL and C++.

we've all agreed

LOL.

(will Ehud punish me for this utterly unsubstantial comment? At any rate, the previous comment makes me laugh)

LOLing is warranted from

LOLing is warranted from time to time (and given the age of this thread, the discussion is somewhat surreal anyway).

This suggests a paper

Lambda, the Ultimate Surrealism... We need to convince Guy and Gerry to complete the set. :-)

Instead...

Can we change C++ to C#? It would add to the versatility of the language.

A wiki for fleshing out these concepts

Everyone's chiming in with their own thoughts and there is a lot of confusion. I set up a wiki at http://lambdalang.deluxecode.com/ so people can contribute and turn these ideas into an actual specification for one or more languages. I also got the ball rolling with some ideas from the thread here. Please go to the wiki and contribute your thoughts.