hi all,
I'm a newbie here but I assure you I'm not a troll. I was reading about Pluvo and then it struck me...
so you are bored and want to write a new language. what do you do at first? plan a look and feel. well, different people come from different backgrounds and they don't like the same look and feel. but then, it basically means you write a lexer, a parser, and then finally the interpreter. too much work. and it's not reusable by any other bored programmer. it's fun doing it again, of course. :)
I was thinking, what about having a rough lexical standard for interpreted languages? the point is that with clear specification everyone can write a lexer/parser and then extend it. hopefully, the base will already be there.
of course, different languages are different...
but I think some guidelines are absolutely necessary. not for the expert programmers, they can cope with anything, but the objective is to reduce the learning time for students.
so here are some imaginary rules. each idea is independent of the others. if you like them, that would probably convince you that it is indeed useful to do something like this.
- source codes are XML files. there are four different types of text: documentation, annotations, code and comments. an IDE is a must so that full advantage of this formulation can be taken. the basic principle is that source code itself is valuable data. something like:
<source language = "imaginary" target = "2.1">
<documentation>
The classical "Hello World" program
</documentation>
<code>
<annotation visibility = "public" />
<comment>the exact syntax is not relevant here</comment>
say "Hello World"
</code>
</source>
</documenation>
too verbose for text editors, but not so for an IDE. parsing XML is easy. displaying different things in different colors is.. well.. depends on you. but text editors like gedit, emacs, jEdit, Notepad++, know what to do.
- the encoding is always Unicode. this basically means you can use mathematical symbols as part of your language. advantage? well, it's been like 40 years or so of ASCII based programming, right?
-
there should be an all-pervasive naming convention. either camelCase or if the language permits, hyphen-seperated-words. but not both.
- long names are preferred. so no more MVar or mapM_ or __init__ and scoping rules apply. Monad.map looks pretty (over mapM). scaring off uni students is not going to do any good.
- the languages should have a clear-cut way of communicating with the outside world. the names exported should not contain any eccentricities (should be pure ([alpha][alpha|num]*)).
- for variable names (or function names), same name with different annotations should not have different meaning (I don't hate Perl, but I think @list and $list having completely different meanings is not healthy).
-
there will be no way to write two different statements on the same line (like ; separated ones in C). block structure dictated by indentation. this suggestion brings up the controvertial idea of 'good taste' again. I think once someone implements a very good library for dealing with one particular standard like these, that library, hopefully very well documented, people will accept the standard just the way it is. may be they will demand minor tweakings here and there, but not more than that. psychology is a really really mysterious thing.
things like these... the problem with interpreted language is that they are very concise, in my humble opinion, too concise. of course, diving any deeper will cause so much difference in opinions that I should probably stop here.
Recent comments
27 weeks 6 days ago
27 weeks 6 days ago
27 weeks 6 days ago
50 weeks 1 day ago
1 year 2 weeks ago
1 year 3 weeks ago
1 year 3 weeks ago
1 year 6 weeks ago
1 year 11 weeks ago
1 year 11 weeks ago